✍ ^New ChatGPT and DeepSeek: Can They Predict the Stock Market and Macroeconomy?

with Jian Chen, Guohao Tang and Wu Zhu (current version: Feb., 2025).

We study whether ChatGPT and DeepSeek can extract information from the Wall Street Journal to predict the stock market and the macroeconomy. We find that ChatGPT has predictive power. DeepSeek underperforms ChatGPT, which is trained more extensively in English. Other large language models also underperform. Consistent with financial theories, the predictability is driven by investors’ underreaction to positive news, especially during periods of economic downturn and high information uncertainty. Negative news correlates with returns but lacks predictive value. At present, ChatGPT appears to be the only model capable of capturing economic news that links to the market risk premium.

Presented at 2024 SIF.

Momentum and Factor Momentum: A Re-Examination

with Cheng Gao, Sophia Zhengzi Li and Peixuan Yuan (current version: December, 2024).

(Data and Code generating the main result)

We show that the momentum factor remains a unique and irreplaceable factor, in contrast to the redundancy finding of Ehsani and Linnainmaa (2022), which suffers from an omitted-variable problem. By adding a betting-against-systematic (BAS) factor to their framework, we find that the momentum factor exhibits significant alpha. Further, we demonstrate that even an improved factor model, such as IPCA, cannot explain the momentum unless momentum characteristics are utilized. Moreover, in an attribution analysis, we show that firm-specific components, not non-momentum factors, are the primary drivers of momentum returns.

Anomalies and Links to Market Return Predictability: Supranational Evidence Based on New Market Efficiency Measures

with Xi Dong, Yan Li, Yanran and David Rapach (current version: March, 2025).

We connect cross-sectional anomalies to time-series market return predictability in an international setting for 44 non-US countries. Although a large number of representative anomaly returns exhibit limited predictive ability for market returns at the country level, they evince strong evidence when aggregated to the supranational level. After supranational aggregation, long-short anomaly portfolio returns predict developed market returns, while long- or short-leg anomaly returns predict emerging market returns; furthermore, characteristics themselves, on which anomalies are based, become stronger market return predictors after supranational aggregation. We develop a decomposition of anomaly-market links into three new market efficiency measures of broad interest and show that they explain the predictability patterns in the data.

Risk Momentum: A New Class of Price Patterns

with Sophia Zhengzi Li and Peixuan Yuan (current version: October, 2024).

We uncover a new pattern: the stock risk component exhibits momentum. This risk momentum yields a return momentum: stocks sorted by risk have persistent positive returns. In comparison with the extremely popular and extensively studied Jegadeesh and Titman (1993) momentum sorted by return, which is valid only monthly and only for stocks, our risk-based return momentum holds intraday, daily, weekly, and monthly, and exists not only for stocks but also for corporate bonds and other asset classes. Furthermore, our risk momentum, the strongest ever discovered, is different from the factor momentum of Ehsani and Linnainmaa (2022) sorted by factor performance.

Intraday Option Reversals: Return Predictability and Market Efficiency

with Heiner Beckmeyer, Ilias Filippou and Zhaoque (Chosen) Zhou (current version: Jan., 2025).

We find the first option reversal patterns intraday: returns reverse half-hourly during the trading day. The reversals are both economically and statistically significant and are robust to transaction costs and various controls, such as implied volatility changes and market frictions. The reversals are unrelated to cross-day momentum. Additionally, we provide an option-demand theoretical framework to explain the patterns. Our findings suggest that intraday demand pressures, which drive the reversals, are important for intraday asset pricing and have profound implications for market efficiency.

Which Factors Matter in the Pricing Kernel?

with Bin Luo and Ti Zhou (current version: Jan., 2025).

We propose a general framework for selecting factors that provide independent information for mean-variance efficiency, with the estimated portfolio converging to the theoretically efficient portfolio at the fastest rate to date. When applied to 174 factors, we find that EarningsPredictability and the market are the most important contributors, followed by intangibles, momentum, and investment factors. The out-of-sample Sharpe ratio is 3.59, outperforming alternative methods. Our approach also introduces a novel model comparison test, rejecting eight well-known factor models. After adjusting for publication bias, the Sharpe ratio remains 2.47, while a real-time three-factor model achieves 1.21, outperforming the Fama-French model (0.54).

Leading Stocks and the Stock Market Expected Returns

with Zhuo Chen, Xianfeng Hao and Honghai Yu (current version: Feb., 2025).

We identify leading stocks using a machine learning method, and find that the negative leaders, which lead other stocks negatively, have strong predictive power for future stock market returns both in- and out-of-sample, whereas the positive leaders do not. The predictability generates significant economic value for a mean-variance investor in asset allocation. Economically, underreaction of the followers of the negative leaders appears to be the driving force behind the predictability. Our study provides the first empirical evidence that bridges the lead-lag literature and the literature on the predictability of the market risk premium.

Fear in the "Fearless" Treasury Market

with Tianyang Wang, Yuanzhi Wang and Qunzi Zhang (current version: Nov., 2024).

This paper examines how fear affects the Treasury market and predicts Treasury bond returns. Using a text-based fear index from social and news media, we find that fear significantly predicts future Treasury returns, both in-sample and out-of-sample, and suggests the global transmission of fear. We also propose a model explaining that risk aversion shocks drive bond risk premia. Our paper further explores various dimensions of fear effects, such as term, magnitude, dynamics, and sources, and compares them with other sentiments. The results highlight the critical role of fear in Treasury market dynamics.

Optimal Portfolio Choice with Economic Constraints: A Genetic Programming Approach

with Yang Liu (current version: April, 2024).

We develop a new approach to construct the mean-variance efficient portfolio by directly targeting the optimal weight with economic-motivated regularization that incorporates economic constraints to guard against overfitting and enhance interpretability. Instead of struggling with noisy estimators of expected return and covariance matrix, we interpret a portfolio rule as a mapping from historical data to optimal weights and take advantage of the vigorous searching capability of genetic programming (GP) to estimate this weighting function directly. While conventional penalties, such as L1 and L2 norms, are not feasible in our model due to GP's non-parametricity, we propose a trading-frictions-based regularization to control model complexity while preserving interpretability. The out-of-sample Sharpe ratio of our GP approach more than doubles those of existing methods. Beyond portfolio choice, we also derive a model-implied expected return measure from the GP-optimal weight and find that it subsumes the predictability of other machine learning methods in the cross-section of stock returns. Our study highlights the importance of marrying machine learning and economic rationale for interpretable machine learning applications in asset pricing.

Equity Risk Premium Prediction: Return Decomposition and Noise Shrinkage

with Yanyan Lin, Chongfeng Wu and Shunwei Zhu (current version: Nov., 2024).

We propose a novel decomposition of stock returns into a fundamental component (FC) and an unexpected capital gains component (UC). The FC, driven by firms' valuation ratios, reflects long-term growth and exhibits high persistence, while the UC, influenced by market trading prices, reflects short-term fluctuations and is more random. To predict the UC, we use a predictive regression model with an L multiplier to shrink noise and mitigate estimation errors. Among the 41 monthly predictors examined by Goyal, Welch and Zafirov (2024), we find that 33 significantly outperform the historical average forecast, compared to only 5 with their method. Aggregating information across the predictors, we reaffirm the predictability of the equity risk premium.
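
As a rough illustration of what "outperforming the historical average forecast" usually means in this literature, the sketch below computes an out-of-sample R-squared against an expanding-window mean benchmark. It is a generic textbook measure, not the paper's own procedure; the function names `oos_r2` and `historical_average_forecasts` and the sample numbers are hypothetical.

```python
def historical_average_forecasts(returns, start):
    """Expanding-window mean forecast for each period t >= start.

    The forecast for period t uses only returns[:t], so no future
    information leaks into the benchmark.
    """
    return [sum(returns[:t]) / t for t in range(start, len(returns))]


def oos_r2(actual, forecast, benchmark):
    """Out-of-sample R^2 = 1 - SSE(model) / SSE(benchmark).

    Positive values mean the model's squared forecast error is smaller
    than that of the benchmark over the evaluation period.
    """
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1.0 - sse_model / sse_bench


# Hypothetical monthly excess returns, for illustration only.
returns = [0.01, -0.02, 0.03, 0.00, 0.02]
start = 2
benchmark = historical_average_forecasts(returns, start)
actual = returns[start:]
```

A forecast that merely reproduces the benchmark scores exactly zero under this measure, so any positive value indicates an improvement over the historical average.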

ETFs, Anomalies and Market Efficiency

with Ilias Filippou, Songrun He and Sophia Zhengzi Li (current version: April, 2024).

We construct a stock-level composite mispricing score CZ Net based on over 200 anomalies. We find that a long-short CZ Net portfolio formed by low ETF ownership stocks yields higher returns, greater Sharpe ratios, and more significant alphas compared to the portfolio formed by high ETF ownership stocks. Furthermore, low ETF ownership stocks exhibit greater price delay and lower information efficiency. These findings remain robust after controlling for characteristics related to short-sale constraints, arbitrage costs, and the information environment. Using Russell index reconstitution as a natural experiment, we provide additional causal evidence of ETF ownership attenuating anomaly profits.

Presented at SGF 2023 and WFA 2023; would have been presented at TAU Finance Conference 2023.

Unusual Financial Communication: ChatGPT, Earnings Calls, and Financial Markets

with Lars Beckmann, Heiner Beckmeyer, Ilias Filippou and Stefan Menze (current version: Feb, 2025).

We devise a prompting strategy for ChatGPT to detect and analyze unusual aspects of financial communication in earnings calls. We identify 25 dimensions across four categories: unusual communication styles by executives and analysts, unusual contents, and technical difficulties. Unusual financial communication is common, correlates with certain firm characteristics and fluctuates with the business cycle. Financial markets react to both aspects – unusual communication styles and unusual contents – with a negative stock return, elevated trading activity, higher volatility and option-implied uncertainty, and downward revisions of next-quarter earnings forecasts by analysts. Our study demonstrates the potential of large language models to provide new insights into the interpretation of financial textual data.

Finalist for Crowell Memorial Prize; Presented at CICF 2024.

What Drives the Earnings Announcement Risk?

with Hong Liu, Yingdong Mao and Xiaoxiao Tang (current version: July, 2024).

We provide the first estimates of the ex-ante risk premia on earnings announcements using the forward-looking information in the options market. We find that the average earnings announcement risk premium is highly significant at 16 basis points, with substantial variation across firms and across time. Sorting by the ex-ante estimated risk premia generates a daily return spread of 40 bps between high and low terciles. Moreover, the ex-ante estimated risk premia provide new insights on what drives the well-documented positive post-earnings-announcement drift and yield profitable straddle strategies.

A New Option Momentum: Compensation for Risk

with Heiner Beckmeyer and Ilias Filippou (current version: June, 2024).

This paper introduces a novel momentum strategy in the options market based on the systematic component of option returns. Utilizing a latent factor model to decompose option returns, we demonstrate that the systematic component exhibits stronger momentum and subsumes the performance of conventional return-based momentum. With a six-month formation and one-month holding period, the strategy achieves an annualized Sharpe ratio of 2.23, compared to 1.08 for traditional momentum, and is highly profitable for various formation and holding periods. The superior performance is driven by time-varying risk compensation rather than investor biases, underscoring the economic rationale behind its success.
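
For readers unfamiliar with how an annualized Sharpe ratio is typically obtained from a strategy held for one month at a time, here is a minimal sketch. It assumes the inputs are monthly excess returns and uses the common square-root-of-12 scaling; the function name `annualized_sharpe` and the sample returns are hypothetical, and the paper's exact computation may differ.

```python
from math import sqrt
from statistics import mean, stdev


def annualized_sharpe(monthly_excess_returns, periods_per_year=12):
    """Mean over sample standard deviation, scaled by sqrt(periods per year).

    Assumes the returns are already in excess of the risk-free rate and
    roughly serially uncorrelated, which is what the sqrt scaling relies on.
    """
    ratio = mean(monthly_excess_returns) / stdev(monthly_excess_returns)
    return ratio * sqrt(periods_per_year)


# Hypothetical monthly excess returns, for illustration only.
sample = [0.02, 0.01, 0.03, 0.00]
sharpe = annualized_sharpe(sample)
```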

Best paper awards, INQUIRE UK/Europe, 2024 and FMA Asset Mgt Consortium at Cambridge, 2024; Presented at EFA, 2024.

✍ ^New Option Expected Hedging Demand

with Xiaoxiao Tang and Zhaoque (Chosen) Zhou (current version: Feb., 2024).

Options market makers' delta hedging has an increasing impact on underlying stock prices, as both the option volume and the ratio of option volume to stock volume have grown drastically in recent years. We introduce a novel approach utilizing real-time option information to calculate the spot elasticity of delta (ED) and expected hedging demand (EHD), and find that the EHD significantly predicts future stock returns in the cross section. The positive impact of EHD on stock prices lasts up to five trading days, and then a reversal follows. The empirical evidence of a heterogeneous EHD-return relationship, influenced by ED, leads to varied option market maker behaviors, and is consistent with conventional economic theory. Moreover, we find that EHD has little correlation with other popular firm characteristics, representing a new risk that is not captured by conventional factor models.

Seeing is Believing: Annual Report 'Graphicity' and Stock Returns Predictability

with Xiahu Deng, Lei Gao and Bo Hu (current version: Sept., 2023).

Why do firms graphically enhance their annual reports that appear redundant to the 10-Ks? We develop a novel rational model to explain this. Using a large dataset, we report the first evidence that firms earn approximately 3.5% abnormal returns in the next 3 to 6 months after they initiate graphic annual reports. This is accompanied by an increase in institutional investors' holdings, consistent with our theory that firms create visuals to overcome investor inattention and help communicate subtle information to fundamental investors. This is also consistent with the fact that such firms tend to increase their R&D investments afterwards.

Presented at 2024 EFA.

Market Risk Premium Expectation: Combining Option Theory with Traditional Predictors

with Hong Liu, Yueliang (Jacques) Lu and Weike Xu (current version: Oct., 2024).

We extend the Martin (2017) option bound by incorporating economic state variables, linking option-based bounds to the traditional predictability literature. Our state-dependent bounds (SDBs) significantly improve out-of-sample predictions of the market risk premium, outperforming models that rely solely on either option prices or traditional stock market predictors. Moreover, SDBs substantially increase portfolio Sharpe ratios and enhance investor utility. In a cross-sectional analysis of expected stock returns, we show that option-based information provides incremental value beyond conventional firm characteristics. Our novel findings highlight the importance of integrating information in both option prices and economic state variables.

Presented at AFA 2024.

Market Risk Premium: Best Linear Predictor in High Dimension

with Fuwei Jiang, Kunpeng Li and Guoshi Tong (current version: Oct., 2024).

In the age of big data, PCA and PLS are widely used in finance for dimension reduction to identify a few predictive factors. In this paper, we make a surprising discovery that the dimension reduction can achieve the optimal lower bound of one in an equivalent model. We propose a supervised learning method to find the optimal predictive factor, which is the best linear combination of a large set of predictors. Our approach outperforms alternative dimension reduction techniques, such as PCA and PLS, theoretically. Just as an efficient portfolio highlights which assets are crucial in the pricing kernel, our optimal predictive factor pinpoints the most significant combination of predictors in forecasting. When applied to predicting the market risk premium, our method outperforms empirically not only PCA and PLS, but also all the state-of-the-art machine learning methods used by Dong, Li, Rapach, and Zhou (2022). Furthermore, our method reveals a set of novel predictors. Additionally, we identify the optimal predictive factors for market volatility, bond excess returns and macroeconomic aggregates, and find that our method continues to perform the best.

Maximizing the Sharpe Ratio: A Genetic Programming Approach

with Yang Liu and Yingzi Zhu (current version: Feb., 2025).

(On-line Appendix)

While existing studies focus on minimizing model fitting errors, we maximize directly the Sharpe ratio of spread portfolios with a genetic programming (GP) approach. We find that the GP approach can double the performance in the US and outperform internationally, compared with other approaches under examination. We also apply the GP to maximize the Sharpe ratio of investing in all the underlying stocks, which amounts to searching for the stochastic discount factor that prices all the assets. We find that the Sharpe ratio is 75% greater than before, indicating the loss of relying on spread portfolios for investing and pricing can be substantial.

Presented at AFA 2024, and at 2021 CICF.

Which Expectation?

with Juhani T. Linnainmaa and Yingguang (Conson) Zhang (current version: Dec., 2023).

We test a theory of two expectations in asset pricing: investors separately form beliefs on cash flow level and cash flow growth when valuing assets. Using 123 anomalies and analysts’ earnings term structure forecasts, we find strong evidence for the separability of the two beliefs. Forecast errors in cash flow level and cash flow growth are uncorrelated. Anomaly portfolios typically manifest biases in one belief or the other but not both. Anomalies with large (small) alphas often have the two biases amplifying (offsetting) each other. The first two principal components of anomaly returns are essentially a growth bias factor and a level bias factor. The two biases explain about 50% of the anomaly portfolios' cross-sectional deviation from the CAPM. Level bias generates large initial alpha and growth bias generates persistent alpha. We also provide an explanation for the recent alpha decay with analysts’ improved forecast accuracy.

Presented at AFA 2024.

Myopic Expectations and Stock Market Mispricing

with Yingguang (Conson) Zhang and Yingzi Zhu (current version: April, 2024).

(On-line Appendix)

Are expectations in financial markets myopic? Based on a new multi-horizon expectation framework and using data of U.S. stock analysts’ forecasts, we find that their forecasts are myopic, and their myopic expectations are associated with large price distortions even in recent periods. Our study distinguishes among different sources of myopic expectations, reconciles myopia with long-horizon belief overreaction, quantifies myopia effects across horizons, tests the role of information frictions, and assesses the economic significance in terms of trading profits. Our framework is generally applicable to other settings with multi-horizon expectations, providing a useful tool for future research.

✍ ^New Fama-MacBeth Regression with Asset Pricing Restriction

with Yuanqi Yang and Yifeng Zhu (current version: May, 2024).

In this paper, we propose a modified Fama-MacBeth regression that incorporates asset pricing restrictions into the estimation. The restrictions require the model to explain both the time series and cross-sectional variations, and also to select factors for sparsity. Solving the estimation via a least angle regression-type algorithm, we find empirically that the new model outperforms existing factor selection methodologies in predicting the cross-sectional stock returns. In addition, we propose new interpretable characteristics-based factors, and our factors outperform classical factors models.
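
For context, the standard two-pass Fama-MacBeth procedure that this paper modifies can be sketched as follows for a single factor. This is only the textbook baseline, without the asset pricing restrictions or the least angle regression-type algorithm described above; the helper names and the noise-free sample data are hypothetical.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Univariate OLS of y on x; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def fama_macbeth_single_factor(asset_returns, factor):
    """Textbook two-pass estimate of a single factor's risk premium.

    Pass 1: time-series regression of each asset's returns on the factor
            gives one beta per asset.
    Pass 2: in each period, a cross-sectional regression of returns on the
            betas gives a slope; the risk premium is the time average.
    """
    betas = [ols_slope_intercept(factor, r)[0] for r in asset_returns]
    slopes = []
    for t in range(len(factor)):
        cross_section = [r[t] for r in asset_returns]
        slopes.append(ols_slope_intercept(betas, cross_section)[0])
    return betas, mean(slopes)


# Hypothetical noise-free data: each asset's return is beta times the factor.
factor = [0.01, -0.02, 0.03, 0.00]
true_betas = [0.5, 1.0, 1.5]
asset_returns = [[b * f for f in factor] for b in true_betas]
betas, premium = fama_macbeth_single_factor(asset_returns, factor)
```

With noise-free data of this kind, the estimated betas recover the true ones and the estimated premium equals the sample mean of the factor.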

✍ ^New Pockets of Factor Pricing

with Sophia Zhengzi Li and Peixuan Yuan (current version: December, 2023).

Current factor models assume certain pre-specified factors can price or explain asset returns with the same level of ability across time. In contrast with this conventional wisdom, we find that factors' pricing ability exhibits notable temporal variation and tends to cluster in certain periods referred to as "pockets." We propose a real-time approach to effectively identify the pockets, and apply it to a comprehensive set of firm characteristics. We find episodic and distinct dynamics of return predictability for different types of characteristics, contradicting the notion of continuous presence of the same factors with the same pricing ability. Exploiting factors' time-varying predictive power, we construct a composite predictor/factor that achieves a value-weighted hedge return of 3.94% per month with a high t-statistic of 13.87. Additionally, the composite factor pricing model, which incorporates a selection of factors with factor timing, demonstrates superior effectiveness in both explaining and predicting market anomalies. The factor also provides a comprehensive explanation for factor momentum, which is shown to be a consequence of the past performance of factor returns.

✍ ^New Did Retail Traders Take Over Wall Street? A Tick-by-Tick Analysis of GameStop's Price Surge

with Zhaoque (Chosen) Zhou (current version: November, 2024).

GameStop’s stock price unprecedentedly surged by over 2800% in January 2021. Unlike previous studies, we utilize tick-by-tick data of both stock and option trades to show that this dramatic price rise was primarily driven by overnight trading and largely fueled by institutional orders rather than retail activity. Our analysis of option trading further provides evidence of a “gamma squeeze”. Theoretically, we extend the Brunnermeier and Pedersen (2005) model to explain several of our key findings. Overall, we conclude that it is because of the institutional backing that retail investors succeeded in driving the stock price bubble.

Presented at CICF 2024.

Macro Financial Trends and Equity Risk Premium

with Yufeng Han and Yueliang (Jacques) Lu (current version: Feb., 2025).

This paper shows that trends, typically used for monetary policy guidance, are also effective in predicting market excess returns. Using a linear combination method across 14 economic and financial predictor variables, we find that moving-average trends outperform the variables' current values in forecasting market returns. Incorporating neural networks further improves these predictions. Our findings underscore the importance of trends, supporting the Federal Reserve's emphasis on trends over lagged variables. When accounting for nonlinearity, we find that market return predictability is significantly greater than commonly believed. Our results are robust across both U.S. and global equity markets.
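
To make the notion of a moving-average trend concrete, the sketch below computes a trailing average of a predictor series, which would then replace the predictor's current value in a forecasting regression. The window length, the function name `trailing_moving_average`, and the sample series are hypothetical; the paper's actual trend construction is not specified here.

```python
def trailing_moving_average(series, window):
    """Trailing moving average using only current and past observations.

    The first output corresponds to the first date with a full window,
    so the result has len(series) - window + 1 entries.
    """
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[end - window:end]) / window
        for end in range(window, len(series) + 1)
    ]


# Hypothetical predictor values, for illustration only.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
trend = trailing_moving_average(predictor, window=3)
```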

✍ ^New Expected Index Option Return: What Can We Learn From Macro and Anomalies

with Heiner Beckmeyer and Guoshi Tong (current version: Jan., 2024).

We provide the first study on whether and how the expected returns on stock index options are predictable, extending the large literature on the predictability of the stock market in a new direction. We find that stock index option returns are predictable by common macroeconomic predictors, whose predictive power on options is even stronger than on the underlying. We also find that, although stock market inefficiency, as captured by anomalies, explains future option returns, option market inefficiency plays a greater role. The economic value of incorporating the option predictability versus ignoring it can be substantial.

Presented at CICF 2024.

Bottom Up vs Top Down: What Does Firm 10-K Tell Us?

with Landon Ross, Jim Horn, Mert Pilanci and Kaihong Luo (current version: November, 2024).

In contrast to the recent increasing focus on large language models, we propose a bottom-up approach that exploits the individual predictive power of each word. Our word dictionary is constructed by using a data-driven approach, and it is these selected words that are used to build the predictive model with lasso regularized regressions and large panels of word counts. We find that our approach effectively estimates the cross-section of stocks' expected returns, so that a factor that summarizes the information generates economically and statistically significant returns, and these returns are largely unexplained by standard factor models. However, an inspection of the factor dictionary indicates that it contains many words with possible risk-related interpretations, such as currency, oil, research, and restructuring, which increase a stock's expected return, while the words acquisition, completed, derivatives, and quality decrease the expected return.

No Sparsity in Asset Pricing: Evidence from a Generic Statistical Test

with Junnan He and Lingxiao Zhao (current version: Feb., 2024).

We provide a generic statistical test to discern whether there is sparsity in high-dimensional factor models. Applying the test to recent characteristic-based factor models, we find that the null hypothesis of fewer than ten factors capable of explaining the cross-section of stock returns is rejected. Moreover, a dense model representation outperforms sparse models both in pricing in the cross-section and as an investment strategy, which provides an economic explanation for the testing result. Overall, there is no sparsity in asset pricing in the large space of characteristic-based factors.

Useful Factors Are Fewer Than You Think After Accounting for False-Discovery

with Bin Chen and Qiyang Yu (current version: February, 2024).

We examine how many of a wide range of 207 factors have incremental information in explaining cross-sectional stock returns. First, we find that the significance of each factor changes drastically over time. After accounting for the false discovery rate (FDR), only 157 out of 207 factors are significant from 1967 to 2021, and only 56 from 2000 to 2021. Second, from 2000 to 2021, we find strikingly that only 3 clusters of factors have incremental information. We further propose a new flexible time-varying latent factor model, test in an alternative way the number of factors that capture the information of the 56 significant factors while controlling for FDR, and find only 3, the market plus 2 latent ones, a number much smaller than widely believed.
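
One common way to control the false discovery rate across many factor tests is the Benjamini-Hochberg step-up procedure, sketched below. The abstract does not say which FDR method the paper uses, so this is an assumption chosen for illustration; the function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Sort the m p-values, find the largest rank k with p_(k) <= alpha * k / m,
    and reject every hypothesis ranked at or below k. Returns a list of
    booleans aligned with the input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected


# Hypothetical p-values from ten factor tests, for illustration only.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(p_vals, alpha=0.05)
```

In this example five p-values fall below 0.05 individually, yet only the two smallest survive the correction, which is the kind of shrinkage in the count of significant factors that the abstract describes.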

Interpretable Factors of Firm Characteristics

with Yuxiao Jiao and Yingzi Zhu (current version: February, 2024).

We propose a new approach to construct factors from firm characteristics. In contrast to existing studies, each of our factors comes from the same group of statistically related firm characteristics, making its economic interpretation possible. The number of groups is not chosen ad hoc, but rather determined by the data. Applying our method to a set of 94 representative firm characteristics, we find that the factors chosen by our approach are not only easy to interpret economically, but also outperform typical machine learning models. We also apply our approach to the recent and highly effective IPCA model of Kelly, Pruitt and Su (2019), and find that our factors not only are well linked to apparent economic risks, but also can price assets no worse than the standard IPCA model.

Presented at 2024 AFA (poster) and 2024 CICF.

Empirical Asset Pricing with Probability Forecasts

with Songrun He and Linying Lv (current version: February, 2025).

We study probability forecasts in the cross-section of asset pricing and find that simple probability forecast models can perform as well as sophisticated ones, all of which deliver Sharpe ratios comparable to the best of existing return forecast models. Combining probability forecasts with return forecasts yields superior portfolio performance versus using each alone. Additionally, probability forecasts augment existing factor models and improve tail risk forecasts significantly. The results suggest that probability forecasts, so far largely ignored, can offer unique and valuable insights into understanding the cross-section of stock returns.

Presented at 2025 AFA.

How Accurate Are Survey Forecasts on the Market?

with Songrun He, Jiaen Li and Linying Lv (current version: March, 2025).

We find that three widely used survey forecasts fail to predict the stock market out-of-sample, raising important questions about the reliability of survey forecasts and the proper interpretation of the extensive literature that depends on them. In contrast, we demonstrate that a naive Bayesian learning model and analysts' expectations can significantly predict the stock market out-of-sample. This suggests that these alternatives provide more meaningful insights into investors' attitudes toward risk. As a result, studying these new sources of information may be more impactful and warrants greater attention compared to the reliance on survey forecasts.

Principal Portfolios: The Multi-Signal Case

with Songrun He and Ming Yuan (current version: October, 2022).

In this paper, we extend Kelly, Malamud, and Pedersen's (2021) new asset pricing framework to allow incorporating multiple predictive signals into optimal principal portfolios. Empirically, we find that the multi-signal theory is valuable for combining signals, improving on a naive combination of single-signal principal portfolios.

Anomaly Returns and FOMC

with Lin Tan and Xiaoyan Zhang (current version: April, 2023).

We find that anomaly returns are generally unchanged during FOMC days. The average return on the long- and short-leg of a comprehensive set of 207 anomalies increases by 26.3 bps and 28.8 bps, respectively, prior to the FOMC and reverses afterwards. But for a small group of anomalies that do have substantial changes, their profitability tends to go down, with absolute pricing errors greater than usual. Our evidence challenges existing studies that find the CAPM performs better during the FOMC period. Furthermore, we uncover that lower participation of retail investors contributes to the decline in profitability.

Presented at 2023 CICF.

Information Transmission from Corporate Bonds to the Aggregate Stock Market

with Sophia Zhengzi Li and Peixuan Yuan (current version: November, 2024).

We provide perhaps the first empirical evidence that the corporate bond market leads the aggregate stock market: the term-structure slope of the bond returns predicts the stock market returns both in- and out-of-sample. This predictability arises from informed bond trading and gradual information diffusion due to market segmentation. Additionally, the bond slope contains valuable information about future firm fundamentals and real economic activity. The lead-lag relationship becomes more prominent when the two markets are less integrated. Moreover, the predictive power extends to stock portfolios sorted by size, value, and industries, particularly those with high credit risk exposures.

Presented at CICF 2024.

Unspanned Risk and Risk-Return Tradeoff

with Huacheng Zhang (current version: Dec, 2023).

A major tenet of modern finance is the risk-return tradeoff, and yet there is a lack of empirical evidence supporting it. We provide an unspanned risk explanation, which, measured as uncertainty beyond financial markets, is well approximated by the macro uncertainty index of Baker, Bloom, and Davis (2016), 90% of which can be attributed to unspanned uncertainty. We find the first out-of-sample evidence that there is a positive risk-return tradeoff after all. In addition, we find that the unspanned risk matters at the stock level too: a high-minus-low unspanned risk portfolio can generate an annualized return of 3.5%.

Hide in the Herd: Macroeconomic Uncertainty and Analyst Forecasts Dispersion

with Shen Zhao (current version: Jan., 2023).

We uncover a negative correlation between macroeconomic uncertainty and the dispersion of security analysts' earnings forecasts, and explain it through the analysts' herding bias. We find that herding firms, whose analysts suffer from the herding bias, have greater firm-level uncertainty than non-herding firms. The stock prices of herding firms have stronger momentum and tend to under-react more to both firm and macro news. Moreover, herding firms' stocks are more likely to be overpriced and earn lower subsequent returns. Our study links the interaction between macro-uncertainty and micro-dispersion to firm characteristics, and our findings support the notion that greater uncertainty leaves more room for psychological biases, which in turn leads to informational inefficiency.

Betting Against the Crowd: Option Trading and Market Risk Premium

with Jie Cao, Gang Li and Xintong Zhan (current version: March, 2025).

We comprehensively study how option trading influences the equity market risk premium. Surprisingly, we find that trading of individual call options predicts the market index more strongly than index options. This predictability is both statistically significant and economically substantial, persisting from weeks to months. Aggregate individual options trading largely reflects investor sentiment and is primarily driven by retail investors. It also forms the key component in an ensemble learning model, combined with index option trading and other related predictors, respectively. Among all predictors examined, option trading emerges as the most powerful predictor of the market risk premium.

Presented at 2023 CICF.

Commodity Inflation Risk Premium and Stock Market Returns

with Ai Jun Hou, Emmanouil Platanakis and Xiaoxia Ye (current version: Dec., 2023).

We propose a novel measure of commodity inflation risk premium (cIRP) based on a term structure model of commodity futures. The cIRP, capturing forward-looking information in the futures markets, outperforms well-known characteristics in explaining the cross-section of commodity returns. The associated cIRP factor has the highest Sharpe ratio among the existing factors, and has substantial new information beyond them. Moreover, various aggregations of the individual cIRP predict stock market returns significantly, even after controlling for major economic predictors including the usual inflation measure. The link between commodities and the stock market is stronger than previously thought.

International Corporate Bond Returns: Uncovering Predictability Using Machine Learning

with Delong Li, Lei Lu and Zhen Qi (current version: November, 2024).

This paper examines cross-sectional predictability of corporate bond returns using a novel international dataset and machine learning techniques. We find significant predictability in both U.S. and non-U.S. markets, with predicting factors differing substantially. Downside risk and illiquidity have a greater influence on corporate bond returns in non-U.S. markets. We further show that corporate bonds in developed economies, compared with those in emerging markets, are more integrated with the U.S. corporate bond market. Developed economies also have a stronger integration between corporate bonds and stocks. These findings shed light on bond pricing and diversification opportunities among international corporate bond markets.

Heterogeneous Responses in Financial Markets: Insights from Machine Learning

with Xiaoxiao Tang and Xiwei Tang (current version: December, 2024).

We propose a machine learning framework that extends the Fama-MacBeth regression to capture individual-level heterogeneity in stock returns, offering greater flexibility in estimating expected returns in the cross section. Leveraging 15 representative firm characteristics to forecast returns, our model nearly doubles the Sharpe ratio of the long-short portfolio compared to the standard Fama-MacBeth approach. Furthermore, our approach is more interpretable and outperforms other machine learning models, even in high-dimensional settings involving 94 characteristics. This study highlights the critical role of stock-level heterogeneity, especially during recessions, and challenges the assumption of homogeneity inherent in the traditional Fama-MacBeth model.

Presented at CICF 2024.

Asymmetry in Variance: Does It Matter to Stock Returns?

with Xiaoxiao Tang (current version: December, 2023).

We propose a new measure, AVar, of asymmetry in the variance of an asset return. Theoretically, we link the ranking of stocks by AVar based on the physical measure to that based on the risk-neutral measure, enabling us to use forward-looking information from the options market to estimate AVar. Empirically, we find that, in the cross section of stocks, the greater the AVar, the greater the stock returns. The term structure of AVar also reflects future time variation in stock returns. Economically, we explain the compensation for bearing the asymmetry risk by interpreting AVar, under certain conditions, as a measure that effectively reflects the asymmetry in investors’ utility curvature between losses and gains, thus highlighting investors’ greater disutility from losses compared to equivalent gains.

Do Labor Flows Matter in the Stock Market?

with Jian Chen, Chunmian Ge, Nan Li and Jiaquan Yao (current version: Dec., 2024).

(On-line Appendix)

Using data from individual resumes of employees at public firms, we propose a novel measure of monthly innovations in aggregate labor market flows. Our findings reveal that this measure significantly predicts one-month-ahead market returns, both in- and out-of-sample. This predictability cannot be explained by existing labor or macroeconomic predictors. Further analysis indicates that the new labor market predictor captures information about firms’ investment growth, which is not efficiently incorporated into stock prices. This inefficiency arises from investor underreaction, driven by factors such as inattention, heightened information uncertainty, and the slow diffusion of negative news. Our study also reconciles and extends existing research that predominantly focuses on hiring and market returns over longer time horizons or in cross-sectional contexts. By providing robust evidence of the short-term predictability of the aggregate market risk premium, we underscore the labor market’s critical role in conveying economic information and affecting the systematic risk of the broad asset markets.

ESG and the Market Return

with Liya Chu, Kent Wang and Bohui Zhang (current version: Dec., 2022).

We propose an environmental, social, and governance (ESG) index. We find that it has significant power in predicting the stock market risk premium, both in- and out-of-sample, and delivers sizable economic gains for mean-variance investors in asset allocation. Although the index is extracted using the PLS method, its predictability is robust to using alternative machine learning tools. We find further that the aggregate of environmental variables captures short-term forecasting power, while that of social or governance variables captures long-term forecasting power. The predictive power of the ESG index stems from both cash flow and discount rate channels.

Presented at CICF 2024.

Does Compensation Matter? Evidence from CD&A Disclosures

with Xiumin Martin and Jie (Jane) Xu (current version: April, 2021).

We study whether the similarity of firm disclosures in the Compensation Discussion and Analysis (CD&A) predicts future stock returns. We find that changes to the language and construction of the CD&As predict firms' future stock returns. A portfolio that longs the CD&A "non-changers" and shorts the "changers" earns a significant Fama-French 5-factor alpha of 5.86% (annualized) over the period 2008-2020. We further find that companies with low CD&A similarity invest less in R&D, are more likely to be targeted by short-sellers, and experience more forced CEO turnover. Our results provide new and strong evidence on the role of executive compensation in the cross-section of stock returns.

Lottery Preference and Anomalies

with Lei Jiang, Quan Wen, and Yifeng Zhu (current version: Nov., 2022).

We construct a lottery factor based on 13 commonly used lottery proxies and show that this factor adds significant explanatory power to prominent factor models for anomalies, especially for those in the skewness and value groups. We find that anomaly returns are significantly stronger among stocks with high lottery features and are mainly driven by the short leg of lottery stocks instead of financial distress. We find further that lottery stocks are often associated with low short volume and high shorting fees, indicating that retail investors' preference to hold lottery stocks leads to a low lendable supply of such shares.

Economic Fundamentals and Short-Run Exchange Rate Prediction: A Machine Learning Perspective

with Ilias Filippou, David Rapach and Mark Taylor (current version: Jan., 2025).

(Appendix)

This paper establishes the out-of-sample predictability of monthly exchange rates based on economic fundamentals using country characteristics, global variables, and their interactions. Previous work does not find consistent evidence of short-horizon predictability, likely due to using a small set of fundamentals and inadequately capturing time variation and nonlinearities in predictive relations. By employing a large set of economic fundamentals and global variables in conjunction with machine-learning techniques, we are able to consistently and significantly outperform the stringent no-change benchmark forecast. We find stronger predictability during periods of crisis and recession. The exchange rate forecasts are also economically valuable, as they generate sizable utility gains for an investor in the context of foreign currency portfolios. To enhance our understanding of the economic drivers of exchange rate predictability, we identify the most relevant predictors for forecasting exchange rates in the fitted machine-learning models.

Presented at Vienna Symposium on Foreign Exchange Markets, 2021, and 5th Workshop in Financial Markets and Nonlinear Dynamics, 2021.

Fundamental Extrapolation and Stock Returns

with Dashan Huang, Huacheng Zhang and Yingzi Zhu (current version: January, 2022).

We propose an economic objective-driven pooling strategy to extrapolate multiple fundamentals simultaneously. This strategy outperforms naive extrapolation strategies that use a single fundamental variable and strategies that use past prices or analyst forecasts, and performs similarly to a machine learning-based pooling strategy. We propose a model to show that fundamental extrapolation has dual price effects: a cash flow effect that pushes the stock price up relative to its fundamental value and a discount rate effect that depresses the stock price by increasing expected volatility. Our empirical results suggest that the discount rate effect dominates the cash flow effect.

Presented at AFA 2022 and EFA 2020.

Firm Fundamental Cycles

with Yufeng Han, Zhaodan Huang, and Weidong Tian (current version: Feb., 2025).

We present the first study of firm cycles analogous to business cycles in the broader economy. To identify the cycles, we construct two novel firm fundamental indexes that capture a wide range of business activities. Empirically, firm cycles help explain key market anomalies, including momentum, long-term reversal, and factor momentum. Additionally, we develop an equilibrium model in which production technology growth follows a mean-reverting process, suggesting that firm momentum (reversal) emerges following positive (negative) technology shocks. Furthermore, the model explains factor momentum, which is driven by firm characteristics and market-wide conditions but remains independent of stock momentum.

Best Paper Award, The World Finance Conference, 2019

Twin Momentum: Fundamental Trends Matter

with Dashan Huang and Huacheng Zhang (current version: June, 2021).

Using trends in firm fundamentals, we find that there is fundamental momentum in the stock market. Buying stocks in the top quintile of fundamental trends and selling stocks in the bottom quintile earns a monthly average return of 0.85%, comparable to that of price momentum. Combining price and fundamental momentum produces a twin momentum that earns an average return exceeding their sum and is difficult to explain by short-sale impediments. Our results not only support the view that fundamental analysis is as important as technical analysis, but also indicate that trends contain incremental information beyond often-used lagged fundamental predictors.

Sparse Macro Factors

with David Rapach (current version: January, 2021).

(Factor Data: Mkt, Yield and Housing)

We use machine-learning techniques to estimate sparse principal components (PCs) for 120 monthly macroeconomic variables from the FRED-MD database. Each sparse PC is a sparse linear combination of the underlying macroeconomic variables, whose active weights allow for their economic interpretation. Innovations to the sparse PCs constitute a set of sparse macro factors. Robust tests indicate that sparse macro factors corresponding to yields and housing earn statistically and economically significant risk premia. A three-factor model comprised of the market factor and mimicking portfolio returns for the yields and housing factors performs well compared to leading multifactor models in explaining numerous anomalies.

Best Paper Award, Inquire UK and Inquire Europe, 2019

Corporate Bond Models: A New Performance Metric

with Xu Guo, Hai Lin and Chunchi Wu (current version: Feb., 2025).

The pricing error (PE) from the IPCA model of Kelly, Palhares, and Pruitt (2023) negatively predicts corporate bond returns in the cross-section, and PEs from other models have even greater predictive power. A long-short PE portfolio based on the IPCA generates an average monthly return of 0.83%, which is economically significant and robust to using various factors and model specifications. Further analysis indicates that investor sentiment is a plausible driver of the PE predictability. Extending the IPCA model to include nontradable sentiment and macroeconomic uncertainty factors, we find that market sentiment and uncertainty play a role in the PE anomaly.

Presented at 2022 CICF.

An Information Factor: Can Informed Traders Make Abnormal Profits?

with Matthew Ma, Xiumin Martin and Matthew Ringgenberg (current version: September, 2019).

We construct an information factor (INFO) using the informed stock buying of corporate insiders and the informed selling of short sellers and option traders. INFO strongly predicts future stock returns -- a long-short portfolio formed on INFO earns monthly alphas of 1.24%, substantially outperforming existing strategies including momentum. INFO explains hedge fund returns in the time-series and cross-section. Higher values of INFO are associated with increases in aggregate hedge fund value. Moreover, funds with higher covariation between their returns and INFO outperform by 0.28% per month. The results show information processing skill is an important source of return variation.

Corporate Activities and the Market Risk Premium

with Erik Lie, Bo Meng and Yiming Qian (current version: October, 2017).

While existing asset pricing studies focus on macroeconomic variables to predict the stock market risk premium, we find that an aggregate index of corporate activities has substantially greater predictive power both in- and out-of-sample, and yields much greater economic gains for a mean-variance investor. The predictive ability of the corporate index stems from its information content about future cash flows. Cross-sectionally, the corporate index performs particularly well for stocks with high information asymmetry.

Sentiment Across Asset Markets

with Dashan Huang, Heikki Lehkonen and Kuntara Pukthuanthong (current version: June, 2018).

In this paper, we study investor sentiment in five major asset markets: stocks, bonds, commodities, currencies, and housing. Based on Thomson Reuters' sentiment measures extracted from 235 news and social media sources, we find that each market is predicted by its own sentiment. Across markets, kitchen-sink regressions reveal that the stock market is influenced only by bond sentiment, while the bond market is affected only by the currency market, which is itself largely unexplained by the others; commodities are related to currencies and housing, and housing can be predicted by stock and bond sentiment. When information is aggregated efficiently by partial least squares (PLS), the predictability of each market increases substantially when using information from all markets rather than only its own sentiment.

Cost Behavior and Stock Returns

with Dashan Huang, Fuwei Jiang and Jun Tu (current version: April, 2017).

This paper shows that investors do not fully incorporate cost behavior information into valuation. Firms with higher growth in operating costs generate substantially lower future stock returns. A long-short spread portfolio earns an average return of about 12% per year after controlling for extant risk factors and firm characteristics. Mean-variance spanning tests show that an investor can benefit from investing in this spread portfolio in addition to well-known factors. Firms with high cost growth also suffer from deteriorations in future operating performance. The negative cost growth-return relation is much stronger around earnings announcement days, among firms with lower investor attention, higher idiosyncratic volatility, and higher transaction costs, suggesting that investor underreaction and limits to arbitrage mainly drive the effect.

Taming Momentum Crashes: A Simple Stop-loss Strategy

with Yufeng Han and Yingzi Zhu (current version: August, 2015).

In this paper, we propose a stop-loss strategy to limit the downside risk of the well-known momentum strategy. At a stop-level of 10%, we find, with data from January 1926 to December 2013, that the maximum monthly losses of the equal- and value-weighted momentum strategies go down from -49.79% to -11.36% and from -64.97% to -23.28%, while the Sharpe ratios are more than doubled at the same time. We also provide a general equilibrium model of stop-loss traders and non-stop traders and show that the market price differs from the price in the case of no stop-loss traders by a barrier option.

Which Hedge Fund Styles Hedge Against Bad Times?

with Charles Cao and David Rapach (current version: February, 2015).

We examine hedge fund style performance in bad versus good times defined as (1) up and down equity market regimes derived from the 200-day moving average of the S&P 500 price index or (2) nonstressed and stressed financial market regimes determined endogenously using the Federal Reserve Bank of Kansas City Financial Stress Index and threshold estimation. We show that hedge fund styles often exhibit significant changes in risk factor exposures across good and bad times. For certain hedge fund styles, changes in factor exposures represent valuable hedges against bad times; in contrast, other hedge fund styles become more exposed to risk factors during bad times in a manner that magnifies downside risk exposure. In the context of “balanced” 40-30-30 portfolios that allocate across U.S. stocks, bonds, and individual hedge fund styles, we find that the Global Macro, Managed Futures, and Multi-Strategy styles provide investors with especially valuable hedges against bad times.

Forecasting Bond Risk Premia Using Technical Indicators

with Jeremy Goh, Fuwei Jiang, and Jun Tu (current version: July, 2013).

While economic variables have been used extensively to forecast U.S. bond risk premia, little attention has been paid to technical indicators, which are widely employed by practitioners. In this paper, we fill this gap by studying the predictive ability of a variety of technical indicators vis-a-vis the economic variables. We find that the technical indicators have statistically and economically significant in- and out-of-sample forecasting power. Moreover, we find that utilizing information from both technical indicators and economic variables substantially improves forecasting performance relative to using just economic variables.

Forecasting Stock Returns During Good and Bad Times

with Dashan Huang, Fuwei Jiang and Jun Tu (current version: May, 2015).

We show that stock returns can be significantly predicted by past realized returns in both good and bad times, in and out of sample. We extend the model in Fama and French (1988) to show that stock returns display mean reversion and momentum over time, depending on the market state. Specifically, past stock returns predict future returns negatively in good times and positively in bad times, which is consistent with the change and level effects in Pástor and Stambaugh (2009).

Hansen-Jagannathan Distance: Geometry and Exact Distribution

with Raymond Kan; November, 2002.

This paper provides an in-depth analysis of the Hansen-Jagannathan (HJ) distance, a measure that is widely used for diagnosing asset pricing models and as a tool for model selection. In the mean and standard deviation space of portfolio returns, we provide a geometric interpretation of the HJ-distance. In relation to the traditional regression approach of testing asset pricing models, we show that the HJ-distance is a scaled version of the aggregate pricing errors, and it is closely related to Shanken's (1985) cross-sectional regression test (CSRT) statistic, with the only major difference being how the zero-beta rate is estimated. For the statistical properties, we provide the exact distribution of the sample HJ-distance and also a simple numerical procedure for computing its distribution function. In addition, we propose a new test of equality of HJ-distances for two nested models. Simulation evidence shows that the asymptotic distribution of the sample HJ-distance is grossly inappropriate for typical numbers of test assets and time series observations, making the small sample analysis empirically relevant.

Toward a Better Understanding of the Beta Method and the Stochastic Discount Factor Method

with Raymond Kan; May, 2002.

In a standardized factor model, Kan and Zhou (1999) show the stochastic discount factor (SDF) method yields less efficient estimates than the beta method when both are based on the generalized method of moments (GMM). By modifying the common use of the SDF [via adding more moment conditions to the practice before the publication of Kan and Zhou (1999)], Jagannathan and Wang (2001) and Cochrane (2000a,b) find that the two methods have the same asymptotic variance for the new GMM estimator (which no longer admits an analytical solution). Moreover, their analysis relies on a joint normality assumption for both the asset returns and the factors. In this paper, we show that: 1) once the normality assumption is relaxed, the modified SDF method is highly sensitive to factor skewness and kurtosis whereas the beta method is not, implying that the SDF estimates can be less reliable in realistic situations where the factors are leptokurtic; 2) in conditional asset pricing models, the modified SDF method is in general still strictly dominated by the beta method in terms of estimation accuracy; 3) while it is not well understood and almost never used in the SDF formulation of an asset pricing model, the maximum likelihood method is well defined and has both strictly more efficient estimates and more powerful tests than the SDF method; 4) the SDF tests can have much less power than the beta method in conditional asset pricing models. In short, while the SDF set-up is an elegant theoretical formulation, empirical estimation and tests should pay as much attention to the beta method as to the SDF method, if not more (one more reason is that, as shown by Jagannathan and Wang (2001), estimated model pricing errors have smaller variance under the beta method than under the SDF method).

The End