Investigating High-Frequency Trading (HFT) Around the Extreme Price Movements in Borsa Istanbul

Finance Market Microstructure

In this study, I analyze the presence of high-frequency trading (HFT) in Borsa Istanbul. My focus is the behavior and market share of high-frequency trading during extreme price movements (EPM).

Irem Dastan
2022-05-08
library(tidyverse)
library(knitr)
library(kableExtra)
library(ggalt)
library(lubridate)
library(stargazer)
library(fastDummies)

data <- read.csv("data.csv")  # The data is available on GitHub.
names(data)[1] <- "stock"     # Rename the first column to "stock".

Fully automated exchanges have increased the number of transactions in the market and enabled intermediaries to expand their use of technology. Software that can process and react quickly to order flow and market information has made it possible to execute a large number of trades in a short time. Speed nevertheless remains a constraint in financial markets, both in using data to invest and in entering trades quickly. As a result, there is a speed race in which the fastest actors compete with one another. Markets are now dominated by computer algorithms rather than by trades placed by humans.

HFT is a type of trading characterized by thousands of order submissions, high order cancellation rates, small intraday profit targets, and positions held for seconds, milliseconds, or fractions of a second. In short, “HFT leverages the technological capability of sending large numbers of orders at low millisecond delays” (Ersan and Ekinci, 2016).

As HFT algorithms compete with each other, they face two challenges:

The dominant role that HFT firms play in providing liquidity and in price arbitrage directly affects the market share of trading venues. In arbitrage, they exploit the price differences of shares traded in more than one market, profiting from small price differences through short-term, high-volume trading.

The BISTECH platform, which went live in the Equity Market on November 30, 2015, made HFT possible in Borsa Istanbul. The HFT share in Borsa Istanbul is not as high as in the USA and European countries.

HFT differs from algorithmic trading (AT) in scope: AT is the broader category and HFT is a subset of it. AT executes trades according to conditions defined in software, while HFT exploits opportunities that last only fractions of a second.

Stocks listed in the BIST30 index.

A 16-month period from December 2015 to March 2017, covering 339 trading days.

Daily versions of the variables, derived from the intraday order and trade books in the study of Ekinci and Ersan (2020).

The variables in the study are total HFT, buy-side and sell-side HFT, and the HFT buy-sell difference (HFT imbalance). The control variables are trading volume, liquidity, and volatility. The main independent variables are dummy variables for extreme positive price movements (stock-days with returns above 2% and above 5%) and extreme negative price movements (stock-days with returns below -2% and below -5%).

The analysis proceeds in three stages.

Descriptive statistics by stock and return range

T-test

Regression analysis

Calculating the HFT

First, some orders in the data are marked as HFT orders. We do this in two stages.

In the first stage, orders with at least two messages (submission, modification, or cancellation) within 1 second or less are marked as HFT orders.

In the second stage, other orders in the same stock that arrive in the same second and with the same size as any message of an HFT order identified in the first stage are also marked as HFT orders.
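The two-stage marking described above can be sketched in base R as follows. This is a minimal illustration on a made-up message log, not the original implementation: the data frame `messages`, its column names, and the reading of "same second" as the same whole clock second are all assumptions.

```r
# Hypothetical message log: one row per electronic message (assumed layout).
messages <- data.frame(
  order_id = c("A", "A", "B", "C", "C", "D"),
  stock    = c("X", "X", "X", "X", "X", "Y"),
  time     = c(10.2, 10.9, 10.5, 20.0, 25.0, 10.4),  # seconds since the open
  size     = c(100, 100, 100, 50, 50, 100),
  stringsAsFactors = FALSE
)

# Stage 1: an order is HFT if two of its messages are at most 1 second apart.
min_gap <- tapply(messages$time, messages$order_id, function(t) {
  if (length(t) < 2) Inf else min(diff(sort(t)))
})
stage1_ids <- names(min_gap)[min_gap <= 1]

# Stage 2: other orders in the same stock with a message in the same
# clock second and of the same size as any message of a stage-1 order.
key <- paste(messages$stock, floor(messages$time), messages$size, sep = "|")
hft_keys <- unique(key[messages$order_id %in% stage1_ids])
is_other <- !(messages$order_id %in% stage1_ids)
stage2_ids <- unique(messages$order_id[key %in% hft_keys & is_other])

# Final flag: a message is HFT if its order was marked in either stage.
hft_ids <- union(stage1_ids, stage2_ids)
messages$is_hft <- messages$order_id %in% hft_ids
```

In this toy log, order A is marked in the first stage, and order B is picked up in the second stage because it shares a stock, second, and size with one of A's messages. Orders C and D are not marked.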

\(HFT\ ratio_{i,t}=\frac{Electronic\ message_{i,t}^{HFT}}{Electronic\ message_{i,t}^{all}}\)

\(Liquidity_{i,t}= \sum_{j=1}^{N} Volume_{i,j,t}*(\frac{Duration_{i,j,t}}{Duration_{total}})\)
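As a worked example of the two measures, the sketch below evaluates them for a single stock-day with made-up numbers. It reads the HFT ratio as HFT messages over all messages, and it assumes that the total duration is the sum of the individual durations, which the formula itself does not state; all values are hypothetical.

```r
# Hypothetical message counts for one stock-day (i, t).
hft_messages <- 120
all_messages <- 2000
hft_ratio <- hft_messages / all_messages

# Hypothetical order-book states j = 1..3: volume and how long each lasted.
volume   <- c(1000, 2000, 4000)   # shares available in each state
duration <- c(60, 30, 10)         # seconds each state was in force

# Time weights: share of the total duration spent in each state
# (assumes Duration_total is the sum of the durations).
weights <- duration / sum(duration)

# Liquidity is the duration-weighted sum of the volumes.
liquidity <- sum(volume * weights)
```

With these inputs the HFT ratio is 0.06 and the liquidity is 1,600, since 1000 x 0.6 + 2000 x 0.3 + 4000 x 0.1 = 1600.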

Analyses

Statistics of Variables in Daily Return Ranges

table1 <- data %>% 
  mutate(
    return_type=case_when(
      return <= -0.05 ~ "below -5%",
      return > -0.05 & return <= -0.02 ~ "between -5% and -2%",
      return > -0.02 & return <= 0.02 ~ "between -2% and 2%",
      return > 0.02 & return <= 0.05 ~ "between 2% and 5%",
      return > 0.05 ~ "above 5%"
    )
  ) %>% 
  group_by(return_type) %>% 
  summarise(
    avg_volatility = mean(volatility),
    sd_volatility = sd(volatility),
    avg_liquidity = mean(liquidity),
    sd_liquidity = sd(liquidity),
    avg_hft_buy = mean(hft_buy),
    sd_hft_buy = sd(hft_buy),
    avg_hft_sell = mean(hft_sell),
    sd_hft_sell = sd(hft_sell),
    avg_hft = mean(hft),
    sd_hft = sd(hft),
    avg_return = mean(return),
    sd_return = sd(return),
    avg_bist30_return = mean(bist30_return),
    sd_bist30_return = sd(bist30_return),
    avg_extra_ret_market = mean(extra_ret_market),
    sd_extra_ret_market = sd(extra_ret_market),
    avg_volume = mean(volume),
    sd_volume = sd(volume)
  ) %>% 
  mutate(
    return_type = factor(
      return_type, levels = c("below -5%",
                              "between -5% and -2%",
                              "between -2% and 2%",
                              "between 2% and 5%",
                              "above 5%")
    )
  ) %>% 
  arrange(return_type) %>% 
  mutate_at(vars(-c(return_type, avg_liquidity, sd_liquidity, avg_volume, sd_volume)),
            .funs = function(x) round(x, digits = 4)) %>% 
  t() %>% 
  as.data.frame() %>% 
  `colnames<-`(.[1,]) %>% 
  slice(-1)

kable(table1) %>% 
  kable_paper("hover", full_width = F)
                      below -5%  between -5% and -2%  between -2% and 2%  between 2% and 5%  above 5%
avg_volatility        0.0829     0.0391               0.0218              0.0350             0.0765
sd_volatility         0.0416     0.0135               0.0107              0.0141             0.0320
avg_liquidity         41445971   30858481             27199639            29163226           30176844
sd_liquidity          61709828   48518623             36877463            34473672           35416041
avg_hft_buy           0.0336     0.0454               0.0551              0.0547             0.0518
sd_hft_buy            0.0354     0.0548               0.0625              0.0583             0.0525
avg_hft_sell          0.0653     0.0566               0.0603              0.0476             0.0401
sd_hft_sell           0.0608     0.0600               0.0680              0.0537             0.0456
avg_hft               0.0501     0.0531               0.0605              0.0535             0.0487
sd_hft                0.0415     0.0510               0.0592              0.0506             0.0443
avg_return            -0.0700    -0.0288              -0.0001             0.0292             0.0710
sd_return             0.0236     0.0073               0.0101              0.0074             0.0283
avg_bist30_return     -0.0278    -0.0132              0.0007              0.0122             0.0156
sd_bist30_return      0.0231     0.0130               0.0097              0.0124             0.0157
avg_extra_ret_market  -0.0423    -0.0156              -0.0007             0.0170             0.0554
sd_extra_ret_market   0.0270     0.0125               0.0100              0.0131             0.0352
avg_volume            33566193   21242601             14819716            20296933           31332308
sd_volume             51693115   32522232             23687389            27891523           45142746

Liquidity has its highest average and standard deviation on stock-days with returns below -5%. Average trading volume is also highest on extreme-return stock-days (below -5% and above 5%) and lowest on stock-days with returns between -2% and 2%.

Number of Days with Returns Above 2% and Below -2% by Stock

graph1_1 <- data %>% 
  mutate(
    return_type=case_when(
      return < -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  group_by(stock, return_type) %>% 
  summarise(number_of_days = n()) %>% 
  na.omit() %>% 
  pivot_wider(names_from = "return_type", values_from = "number_of_days")

graph1_2 <- data %>% 
  mutate(
    return_type=case_when(
      return < -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  group_by(stock, return_type) %>% 
  summarise(number_of_days = n()) %>% 
  na.omit()

ggplot(graph1_1, aes(y = stock)) + 
  geom_point(data = graph1_2,
             aes(x = number_of_days, color = return_type), size = 3) +
  scale_color_manual(values = c("orange","#76a5af")) +
  geom_dumbbell(aes(x = `below -2%`, xend = `above 2%`),
                size_x = 5, 
                size_xend = 5,
                colour_x = "#76a5af",
                colour_xend = "orange") +
  theme_minimal() +
  theme(axis.title = element_blank(),
        legend.title = element_blank(),
        legend.position = "top",
        plot.title = element_text(face = "bold", hjust = 0.5),
        axis.text = element_text(size = 12, face = "bold")) +
  guides(color = guide_legend(reverse=T)) +
  labs(title = "The Number of Days of Returns")

The stock with the highest number of days with a return above 5% is KOZAL; the stocks with the fewest such days are ENKAI, KCHOL, PETKM, SAHOL, TTKOM and TUPRS.

Number of Days with Returns Above 5% and Below -5% by Stock

graph2_1 <- data %>% 
  mutate(
    return_type=case_when(
      return < -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%"
    )
  ) %>% 
  group_by(stock, return_type) %>% 
  summarise(number_of_days = n()) %>% 
  na.omit() %>% 
  pivot_wider(names_from = "return_type", values_from = "number_of_days")

graph2_2 <- data %>% 
  mutate(
    return_type=case_when(
      return < -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%"
    )
  ) %>% 
  group_by(stock, return_type) %>% 
  summarise(number_of_days = n()) %>% 
  na.omit()

ggplot(graph2_1, aes(y = stock)) + 
  geom_point(data = graph2_2,
             aes(x = number_of_days, color = return_type), size = 3) +
  scale_color_manual(values = c("#80dead","#f1cbff")) +
  geom_dumbbell(aes(x = `below -5%`, xend = `above 5%`),
                size_x = 5, 
                size_xend = 5,
                colour_x = "#f1cbff",
                colour_xend = "#80dead") +
  theme_minimal() +
  theme(axis.title = element_blank(),
        legend.title = element_blank(),
        legend.position = "top",
        plot.title = element_text(face = "bold", hjust = 0.5),
        axis.text = element_text(size = 12, face = "bold")) +
  guides(color = guide_legend(reverse=T)) +
  labs(title = "The Number of Days of Returns")

Here, we computed HFT within the chosen return limits across the data and averaged it by stock. The stock with the highest average HFT on days with returns below -5% is TTKOM.

Monthly Average HFT Ratio and Number of Extreme Price Movement Stock-Days

graph3 <- data %>%
  select(day, return, hft) %>% 
  mutate(day = dmy(day), 
         month = month(day),
         year = year(day),
         return_type=case_when(
           return < -0.02 ~ "below -2%",
           return > 0.02 ~ "above 2%"
    )) %>% 
  na.omit() %>% 
  group_by(month, year) %>% 
  summarise("Extreme Price Movements" = n(),
            "High Frequency Trading" = mean(hft)) %>% 
  mutate(day = as.Date(paste0(year, "-", month, "-", 1))) %>% 
  arrange(day) %>% 
  ungroup() %>% 
  select(day, `Extreme Price Movements`, `High Frequency Trading`) %>% 
  pivot_longer(!day, names_to = "types", values_to = "value")

ggplot(graph3, aes(x = day, y = value, group = types, color = types)) +
  geom_line() +
  theme_minimal() + 
  theme(axis.title = element_blank(), 
        legend.position = "none") +
  facet_wrap(~types, scales = "free_y", ncol = 1)

The graph shows the number of extreme price movements (EPM) in the top panel and the average HFT ratio in the bottom panel. The EPM threshold is plus or minus 2%. The EPM count is the number of stock-days with returns beyond these two thresholds. HFT ratios are obtained by dividing the sum of the HFT values in a month by the number of days in that month. The number of EPM days decreases over the period while the HFT ratio increases. The month with the fewest extreme price movements is October 2016, and the month with the most is December 2015. The HFT ratio reached its highest level in March 2017.

HFT Activity in Daily Return Ranges

table2 <- data %>% 
  select(hft, hft_buy, hft_sell, return) %>% 
  mutate(hft_buy_sell_diff = hft_buy - hft_sell,
         hft_inequilibrium = (hft_buy - hft_sell)/hft, 
         return_type=case_when(
           return <= -0.05 ~ "below -5%",
           return > -0.05 & return <= -0.02 ~ "between -5% and -2%",
           return > -0.02 & return <= -0.005 ~ "between -2% and -0.5%",
           return > -0.005 & return <= 0.005 ~ "between -0.5% and 0.5%",
           return > 0.005 & return <= 0.02 ~ "between 0.5% and 2%",
           return > 0.02 & return <= 0.05 ~ "between 2% and 5%",
           return > 0.05 ~ "above 5%"
    )) %>% 
  group_by(return_type) %>% 
  summarise(
    avg_hft = mean(hft),
    avg_hft_buy = mean(hft_buy),
    avg_hft_sell = mean(hft_sell),
    avg_hft_buy_sell_diff = mean(hft_buy_sell_diff),
    avg_hft_inequilibrium = mean(hft_inequilibrium, na.rm = T)
  ) %>% 
  mutate(
    return_type = factor(
      return_type, levels = c("below -5%",
                              "between -5% and -2%",
                              "between -2% and -0.5%",
                              "between -0.5% and 0.5%",
                              "between 0.5% and 2%",
                              "between 2% and 5%",
                              "above 5%")
    )
  ) %>% 
  arrange(return_type) %>% 
  mutate_at(vars(-c(return_type)), .funs = function(x) round(x, digits = 4)) %>% 
  t() %>% 
  as.data.frame() %>% 
  `colnames<-`(.[1,]) %>% 
  slice(-1)

kable(table2) %>% 
  kable_paper("hover", full_width = F)
                       below -5%  between -5% and -2%  between -2% and -0.5%  between -0.5% and 0.5%  between 0.5% and 2%  between 2% and 5%  above 5%
avg_hft                0.0501     0.0531               0.0596                 0.0609                  0.0610               0.0535             0.0487
avg_hft_buy            0.0336     0.0454               0.0495                 0.0557                  0.0602               0.0547             0.0518
avg_hft_sell           0.0653     0.0566               0.0641                 0.0603                  0.0563               0.0476             0.0401
avg_hft_buy_sell_diff  -0.0317    -0.0112              -0.0146                -0.0046                 0.0039               0.0071             0.0117
avg_hft_inequilibrium  -0.6879    -0.2948              -0.2616                -0.0664                 0.0768               0.1529             0.2846

The table reports the averages of the HFT variables for stock-days that fall into different return ranges. Moving from returns below -5% toward zero, the average HFT and buy-side HFT ratios rise; they fall again as returns become strongly positive. The average HFT buy-sell difference is negative for negative returns, shrinks in magnitude as returns approach zero, and turns positive and grows for positive EPMs. The average ratio of the HFT difference to total HFT increases steadily from stock-days with an EPM below -5% to stock-days with a positive EPM.

HFT Activity on Extreme Price Moving Days and Stocks

table3 <- data %>% 
   select(hft, hft_buy, hft_sell, return) %>% 
  mutate(hft_buy_sell_diff = hft_buy - hft_sell,
         hft_inequilibrium = (hft_buy - hft_sell)/hft, 
         return_type=case_when(
           return <= -0.05 ~ "below -5%",
           return > -0.05 & return <= -0.02 ~ "between -5% and -2%",
           return > -0.02 & return <= -0.005 ~ "between -2% and -0.5%",
           return > -0.005 & return <= 0.005 ~ "between -0.5% and 0.5%",
           return > 0.005 & return <= 0.02 ~ "between 0.5% and 2%",
           return > 0.02 & return <= 0.05 ~ "between 2% and 5%",
           return > 0.05 ~ "above 5%"
    )) %>% 
  group_by(return_type) %>% 
  summarise(
    avg_hft = mean(hft),
    avg_hft_buy = mean(hft_buy),
    avg_hft_sell = mean(hft_sell),
    avg_hft_buy_sell_diff = mean(hft_buy_sell_diff),
    avg_hft_inequilibrium = mean(hft_inequilibrium, na.rm = T),
    p_hft = t.test(hft)$p.value,
    p_hft_buy = t.test(hft_buy)$p.value,
    p_hft_sell = t.test(hft_sell)$p.value,
    p_hft_buy_sell_diff = t.test(hft_buy_sell_diff)$p.value,
    p_hft_inequilibrium = t.test(hft_inequilibrium)$p.value
  ) %>% 
  mutate(
    return_type = factor(
      return_type, levels = c("below -5%",
                              "between -5% and -2%",
                              "between -2% and -0.5%",
                              "between -0.5% and 0.5%",
                              "between 0.5% and 2%",
                              "between 2% and 5%",
                              "above 5%")
    )
  ) %>% 
  arrange(return_type) %>% 
  select(return_type, avg_hft, p_hft, avg_hft_buy, p_hft_buy, avg_hft_sell,
         p_hft_sell, avg_hft_buy_sell_diff, p_hft_buy_sell_diff, avg_hft_inequilibrium, p_hft_inequilibrium) %>% 
  mutate_at(vars(-c(return_type)), .funs = function(x) round(x, digits = 4)) %>% 
  mutate_if(is.character, as.numeric) %>% 
  t() %>% 
  as.data.frame() %>% 
  `colnames<-`(.[1,]) %>% 
  slice(-1) %>% 
  mutate_if(is.character, as.numeric) %>% 
  mutate_all(.funs = function(x) x - .[,4])
  
kable(table3) %>% 
  kable_paper("hover", full_width = F)
                       below -5%  between -5% and -2%  between -2% and -0.5%  between -0.5% and 0.5%  between 0.5% and 2%  between 2% and 5%  above 5%
avg_hft                -0.0108    -0.0078              -0.0013                0                       0.0001               -0.0074            -0.0122
p_hft                  0.0000     0.0000               0.0000                 0                       0.0000               0.0000             0.0000
avg_hft_buy            -0.0221    -0.0103              -0.0062                0                       0.0045               -0.0010            -0.0039
p_hft_buy              0.0000     0.0000               0.0000                 0                       0.0000               0.0000             0.0000
avg_hft_sell           0.0050     -0.0037              0.0038                 0                       -0.0040              -0.0127            -0.0202
p_hft_sell             0.0000     0.0000               0.0000                 0                       0.0000               0.0000             0.0000
avg_hft_buy_sell_diff  -0.0271    -0.0066              -0.0100                0                       0.0085               0.0117             0.0163
p_hft_buy_sell_diff    -0.0020    -0.0020              -0.0020                0                       0.0034               -0.0019            0.0097
avg_hft_inequilibrium  -0.6215    -0.2284              -0.1952                0                       0.1432               0.2193             0.3510
p_hft_inequilibrium    -0.0005    -0.0005              -0.0005                0                       -0.0005              -0.0005            0.0004

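The table shows that HFT reduces buy-side activity during extreme negative price movements and reduces sell-side activity during extreme positive price movements.

Most p values appear as 0 (zero) because they are too small to show at four decimal places. Because the code subtracts the baseline column (between -0.5% and 0.5%) from every row, the p-value rows are also shown as differences from the baseline p value, which is why the baseline column is 0 and some p-value entries are negative.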
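The p values in the code above come from one-sample t-tests of each group mean against zero. If the aim is to test whether a return range differs from the baseline range (between -0.5% and 0.5%), a two-sample test is one option. The sketch below uses simulated data, not the study's data; the vectors `baseline` and `extreme` and the parameters used to generate them are assumptions for illustration only.

```r
set.seed(1)

# Simulated HFT ratios (hypothetical): a baseline group and an extreme group.
baseline <- rnorm(500, mean = 0.061, sd = 0.059)
extreme  <- rnorm(100, mean = 0.050, sd = 0.042)

# Welch two-sample t-test (the default in t.test when two samples are given):
# the null hypothesis is that both groups have the same mean.
test <- t.test(extreme, baseline)

# Difference from the baseline and its p value, reported side by side.
diff_in_means <- mean(extreme) - mean(baseline)
p_value <- test$p.value
```

Reporting `diff_in_means` next to `p_value` keeps the difference and its significance tied to the same comparison, which is what the difference-from-baseline layout of the table suggests.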

Extreme price movements and HFT relationship – Regression models

\(HFT_{i,t}= \alpha + \beta_1 EPM_{i,t}^{+0.05} + \beta_2 EPM_{i,t}^{-0.05} + \epsilon_{i,t}\)

\(HFT_{i,t}= \alpha + \beta_1 EPM_{i,t}^{+0.02} + \beta_2 EPM_{i,t}^{-0.02} + \epsilon_{i,t}\)

\(HFT_{i,t}= \alpha + \beta_1 EPM_{i,t}^{+0.05} + \beta_2 EPM_{i,t}^{-0.05} + \gamma_1 Volume_{i,t} + \gamma_2 Liquidity_{i,t} + \gamma_3 Volatility_{i,t} + \epsilon_{i,t}\)

\(HFT_{i,t}= \alpha + \beta_1 EPM_{i,t}^{+0.02} + \beta_2 EPM_{i,t}^{-0.02} + \gamma_1 Volume_{i,t} + \gamma_2 Liquidity_{i,t} + \gamma_3 Volatility_{i,t} + \epsilon_{i,t}\)

Total HFT Ratio Determinants

data_all_1 <- data %>% 
  select(hft, return) %>% 
  mutate(
    return_type=case_when(
      return <= -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%",
      return <= -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  mutate(hft = (hft - mean(hft)) / sd(hft)) %>% 
  fastDummies::dummy_cols(.) %>% 
  mutate_if(is.numeric, ~replace_na(., 0))

model_all_1 <- lm(hft ~ `return_type_above 5%` + `return_type_below -5%`, data = data_all_1)
stargazer(model_all_1, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                    hft            
---------------------------------------------------
`return_type_above 5%`           -0.179**          
                                  (0.084)          
                                                   
`return_type_below -5%`           -0.154*          
                                  (0.093)          
                                                   
Constant                           0.004           
                                  (0.010)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.001           
Adjusted R2                        0.001           
Residual Std. Error         1.000 (df = 10167)     
F Statistic               3.589** (df = 2; 10167)  
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
model_all_2 <- lm(hft ~ `return_type_above 2%` + `return_type_below -2%`, data = data_all_1)
stargazer(model_all_2, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                    hft            
---------------------------------------------------
`return_type_above 2%`           -0.115***         
                                  (0.031)          
                                                   
`return_type_below -2%`          -0.123***         
                                  (0.034)          
                                                   
Constant                          0.025**          
                                  (0.011)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.002           
Adjusted R2                        0.002           
Residual Std. Error         0.999 (df = 10167)     
F Statistic              12.039*** (df = 2; 10167) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
data_all_2 <- data %>% 
  select(hft, return, volume, liquidity, volatility) %>% 
  mutate(
    return_type=case_when(
      return <= -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%",
      return <= -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  mutate(hft = (hft - mean(hft)) / sd(hft),
         volume = (volume - mean(volume)) / sd(volume),
         liquidity =(liquidity - mean(liquidity)) / sd(liquidity)) %>% 
  fastDummies::dummy_cols(.) %>% 
  mutate_if(is.numeric, ~replace_na(., 0))

model_all_3 <- lm(hft ~ `return_type_above 5%` + `return_type_below -5%` +
                    volume + liquidity + volatility, data = data_all_2)
stargazer(model_all_3, type = "text") 

===================================================
                            Dependent variable:    
                        ---------------------------
                                    hft            
---------------------------------------------------
`return_type_above 5%`            0.159*           
                                  (0.088)          
                                                   
`return_type_below -5%`           0.230**          
                                  (0.098)          
                                                   
volume                           -0.246***         
                                  (0.014)          
                                                   
liquidity                         -0.001           
                                  (0.014)          
                                                   
volatility                       -3.811***         
                                  (0.687)          
                                                   
Constant                         0.096***          
                                  (0.020)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.067           
Adjusted R2                        0.067           
Residual Std. Error         0.966 (df = 10164)     
F Statistic             146.079*** (df = 5; 10164) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
model_all_4 <- lm(hft ~ `return_type_above 2%` + `return_type_below -2%` + 
                    volume + liquidity + volatility, data = data_all_2)
stargazer(model_all_4, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                    hft            
---------------------------------------------------
`return_type_above 2%`            -0.040           
                                  (0.031)          
                                                   
`return_type_below -2%`           -0.029           
                                  (0.034)          
                                                   
volume                           -0.245***         
                                  (0.014)          
                                                   
liquidity                         -0.001           
                                  (0.014)          
                                                   
volatility                       -2.594***         
                                  (0.624)          
                                                   
Constant                         0.076***          
                                  (0.018)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.067           
Adjusted R2                        0.066           
Residual Std. Error         0.966 (df = 10164)     
F Statistic             144.902*** (df = 5; 10164) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01

Buy-side HFT Ratio Determinants

data_buy_1 <- data %>% 
  select(hft_buy, return) %>% 
  mutate(
    return_type=case_when(
      return <= -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%",
      return <= -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  mutate(hft_buy = (hft_buy - mean(hft_buy)) / sd(hft_buy)) %>% 
  fastDummies::dummy_cols(.) %>% 
  mutate_if(is.numeric, ~replace_na(., 0))

model_buy_1 <- lm(hft_buy ~ `return_type_above 5%` + `return_type_below -5%`, data = data_buy_1)
stargazer(model_buy_1, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                  hft_buy          
---------------------------------------------------
`return_type_above 5%`            -0.037           
                                  (0.084)          
                                                   
`return_type_below -5%`          -0.336***         
                                  (0.093)          
                                                   
Constant                           0.004           
                                  (0.010)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.001           
Adjusted R2                        0.001           
Residual Std. Error         0.999 (df = 10167)     
F Statistic              6.599*** (df = 2; 10167)  
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
model_buy_2 <- lm(hft_buy ~ `return_type_above 2%` + `return_type_below -2%`, data = data_buy_1)
stargazer(model_buy_2, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                  hft_buy          
---------------------------------------------------
`return_type_above 2%`            -0.001           
                                  (0.031)          
                                                   
`return_type_below -2%`          -0.153***         
                                  (0.034)          
                                                   
Constant                           0.015           
                                  (0.011)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.002           
Adjusted R2                        0.002           
Residual Std. Error         0.999 (df = 10167)     
F Statistic              10.261*** (df = 2; 10167) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
data_buy_2 <- data %>% 
  select(hft_buy, return, volume, liquidity, volatility) %>% 
  mutate(
    return_type=case_when(
      return <= -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%",
      return <= -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  mutate(hft_buy = (hft_buy - mean(hft_buy)) / sd(hft_buy),
         volume = (volume - mean(volume)) / sd(volume),
         liquidity =(liquidity - mean(liquidity)) / sd(liquidity)) %>% 
  fastDummies::dummy_cols(.) %>% 
  mutate_if(is.numeric, ~replace_na(., 0))

model_buy_3 <- lm(hft_buy ~ `return_type_above 5%` + `return_type_below -5%` 
                  + volume + liquidity + volatility, data = data_buy_2)
stargazer(model_buy_3, type = "text") 

===================================================
                            Dependent variable:    
                        ---------------------------
                                  hft_buy          
---------------------------------------------------
`return_type_above 5%`           0.256***          
                                  (0.089)          
                                                   
`return_type_below -5%`           -0.004           
                                  (0.099)          
                                                   
volume                           -0.209***         
                                  (0.014)          
                                                   
liquidity                         0.0004           
                                  (0.014)          
                                                   
volatility                       -3.339***         
                                  (0.693)          
                                                   
Constant                         0.085***          
                                  (0.020)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.049           
Adjusted R2                        0.049           
Residual Std. Error         0.975 (df = 10164)     
F Statistic             105.283*** (df = 5; 10164) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
model_buy_4 <- lm(hft_buy ~ `return_type_above 2%` + `return_type_below -2%` 
                  + volume + liquidity + volatility, data = data_buy_2)
stargazer(model_buy_4, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                  hft_buy          
---------------------------------------------------
`return_type_above 2%`            0.069**          
                                  (0.031)          
                                                   
`return_type_below -2%`           -0.066*          
                                  (0.035)          
                                                   
volume                           -0.209***         
                                  (0.014)          
                                                   
liquidity                          0.001           
                                  (0.014)          
                                                   
volatility                       -2.658***         
                                  (0.630)          
                                                   
Constant                         0.068***          
                                  (0.019)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.049           
Adjusted R2                        0.049           
Residual Std. Error         0.975 (df = 10164)     
F Statistic             105.651*** (df = 5; 10164) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01

Sell-side HFT Ratio Determinants

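data_sell_1 <- data %>% 
  select(hft_sell, return) %>% 
  mutate(
    # case_when() uses the first matching condition, so the 2% bands
    # exclude observations already assigned to the 5% bands;
    # returns between -2% and 2% stay NA
    return_type = case_when(
      return <= -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%",
      return <= -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  # standardize the dependent variable (z-score)
  mutate(hft_sell = (hft_sell - mean(hft_sell)) / sd(hft_sell)) %>% 
  # one 0/1 column per return band; NA entries are set to 0
  fastDummies::dummy_cols(.) %>% 
  mutate_if(is.numeric, ~replace_na(., 0))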

model_sell_1 <- lm(hft_sell ~ `return_type_above 5%` + `return_type_below -5%`, data = data_sell_1)
stargazer(model_sell_1, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                 hft_sell          
---------------------------------------------------
`return_type_above 5%`           -0.278***         
                                  (0.084)          
                                                   
`return_type_below -5%`            0.106           
                                  (0.093)          
                                                   
Constant                           0.003           
                                  (0.010)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.001           
Adjusted R2                        0.001           
Residual Std. Error         0.999 (df = 10167)     
F Statistic              6.165*** (df = 2; 10167)  
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
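Because `case_when()` assigns each observation to the first condition it satisfies, the four return bands built above are mutually exclusive. The short sketch below illustrates this on a hypothetical toy vector of returns (the values are made up for illustration and are not taken from `data.csv`).

```r
library(dplyr)

# Hypothetical toy returns, chosen so that each band is hit once.
toy <- data.frame(return = c(-0.08, -0.03, 0.00, 0.03, 0.08)) %>%
  mutate(
    return_type = case_when(
      return <= -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%",
      return <= -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  )

# The -8% row is labelled "below -5%" only, not "below -2%";
# the 0% row matches no condition and is left as NA.
print(toy)
```

Read this way, the `above 2%` dummy flags returns greater than 2% and up to 5%, and the `below -2%` dummy flags returns greater than -5% and up to -2%. In the specifications that include only the 2% dummies, observations beyond the 5% thresholds therefore appear to fall into the reference group together with the non-extreme observations.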
# Sell side, 2% threshold, no controls
model_sell_2 <- lm(hft_sell ~ `return_type_above 2%` + `return_type_below -2%`, data = data_sell_1)
stargazer(model_sell_2, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                 hft_sell          
---------------------------------------------------
`return_type_above 2%`           -0.189***         
                                  (0.031)          
                                                   
`return_type_below -2%`           -0.052           
                                  (0.034)          
                                                   
Constant                          0.027**          
                                  (0.011)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.004           
Adjusted R2                        0.004           
Residual Std. Error         0.998 (df = 10167)     
F Statistic              19.030*** (df = 2; 10167) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
data_sell_2 <- data %>% 
  select(hft_sell, return, volume, liquidity, volatility) %>% 
  mutate(
    # same mutually exclusive return bands as in data_sell_1
    return_type = case_when(
      return <= -0.05 ~ "below -5%",
      return > 0.05 ~ "above 5%",
      return <= -0.02 ~ "below -2%",
      return > 0.02 ~ "above 2%"
    )
  ) %>% 
  # standardize (z-score) the dependent variable, volume and liquidity;
  # volatility is left in its original units
  mutate(hft_sell = (hft_sell - mean(hft_sell)) / sd(hft_sell),
         volume = (volume - mean(volume)) / sd(volume),
         liquidity = (liquidity - mean(liquidity)) / sd(liquidity)) %>% 
  fastDummies::dummy_cols(.) %>% 
  mutate_if(is.numeric, ~replace_na(., 0))

# Sell side, 5% threshold, with controls
model_sell_3 <- lm(hft_sell ~ `return_type_above 5%` + `return_type_below -5%` + 
                     volume + liquidity + volatility, data = data_sell_2)
stargazer(model_sell_3, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                 hft_sell          
---------------------------------------------------
`return_type_above 5%`            -0.020           
                                  (0.089)          
                                                   
`return_type_below -5%`          0.397***          
                                  (0.099)          
                                                   
volume                           -0.222***         
                                  (0.014)          
                                                   
liquidity                          0.007           
                                  (0.014)          
                                                   
volatility                       -2.537***         
                                  (0.693)          
                                                   
Constant                         0.063***          
                                  (0.020)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.051           
Adjusted R2                        0.051           
Residual Std. Error         0.974 (df = 10164)     
F Statistic             109.717*** (df = 5; 10164) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
# Sell side, 2% threshold, with controls
model_sell_4 <- lm(hft_sell ~ `return_type_above 2%` + `return_type_below -2%` + 
                     volume + liquidity + volatility, data = data_sell_2)
stargazer(model_sell_4, type = "text")

===================================================
                            Dependent variable:    
                        ---------------------------
                                 hft_sell          
---------------------------------------------------
`return_type_above 2%`           -0.134***         
                                  (0.031)          
                                                   
`return_type_below -2%`            0.016           
                                  (0.035)          
                                                   
volume                           -0.222***         
                                  (0.014)          
                                                   
liquidity                          0.008           
                                  (0.014)          
                                                   
volatility                        -1.232*          
                                  (0.629)          
                                                   
Constant                          0.047**          
                                  (0.019)          
                                                   
---------------------------------------------------
Observations                      10,170           
R2                                 0.052           
Adjusted R2                        0.051           
Residual Std. Error         0.974 (df = 10164)     
F Statistic             110.402*** (df = 5; 10164) 
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
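For reference, the sell-side specification with controls estimated above can be written compactly as follows. The notation is introduced here only for illustration: $D^{\text{above}}_i$ and $D^{\text{below}}_i$ stand for the two return dummies at the chosen threshold (5% or 2%), $z(\cdot)$ for the z-score standardization used in the code, and $i$ for an observation in the data.

```latex
\[
z(\text{hft\_sell})_i
  = \beta_0
  + \beta_1 D^{\text{above}}_i
  + \beta_2 D^{\text{below}}_i
  + \gamma_1\, z(\text{volume})_i
  + \gamma_2\, z(\text{liquidity})_i
  + \gamma_3\, \text{volatility}_i
  + \varepsilon_i,
\qquad
z(x)_i = \frac{x_i - \bar{x}}{s_x}
\]
```

Since the dependent variable is standardized, $\beta_1$ and $\beta_2$ are measured in standard deviations of the sell-side HFT ratio. For example, the `above 2%` coefficient of -0.134 means that the sell-side HFT ratio is about 0.134 standard deviations lower on those observations than in the reference group, holding the controls fixed.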

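In summary:

Lower HFT activity was observed on stocks and days with extreme price movements.

On the sell side, HFT decreased during positive extreme price movements: at the 2% threshold, the `above 2%` coefficient is negative and significant at the 1% level both without controls (-0.189) and with controls (-0.134).

On the buy side, HFT decreased during negative extreme price movements: at the 2% threshold with controls, the `below -2%` coefficient is -0.066, significant at the 10% level.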