Abstract: Social media services devote significant effort to mitigating the impact of misleading information but achieve only mixed results. This paper analyzes the strategic implications of such content moderation practices and the factors that determine their effectiveness.
I propose a model of indirect persuasion, in which a sender (political campaigner) tries to influence the action of a receiver (voter) subject to intervention by a moderator (platform). The moderator publicly announces a moderation policy specifying how it will filter messages from the sender. Given the policy, the sender chooses the accuracy and bias of its messages. The moderator then privately observes the sender's message and makes a recommendation. With an exogenous probability, the recommendation follows the policy prescription; otherwise the moderator is free to recommend its preferred action instead, which may differ from the receiver's optimal choice. Finally, the receiver chooses an action based solely on the moderator's recommendation, and this action affects all three players' payoffs.
In equilibrium, the moderator's commitment power and preferences jointly determine how strict a policy it can enforce. A policy is considered stricter if it requires the sender to reveal more information. Conditional on compliance, stricter policies benefit the receiver but create stronger incentives for the sender to adopt an opportunistic strategy that influences the receiver only when the moderator has a chance to revise its policy ex post. Reconciling the tension between ex-ante information elicitation and ex-post revelation, the optimal policy is just lenient enough that the sender always complies with its requirement. Optimal content moderation therefore relies on its deterrence effect. A platform can efficiently curtail misinformation production without actively removing user-generated posts if it commits to a set of well-defined and widely applicable standards. As a prominent example, some news outlets persistently spread inaccurate information on Facebook in violation of its fact-checking rules, because the platform sometimes withholds penalties from violators for fear of accusations of bias. The model implies that Facebook can improve the effectiveness of content moderation by relaxing the formal rules and eliminating deviations in execution.
Work in Progress
Endogenous Polarization in Word-of-Mouth Learning
Abstract: This paper studies a flexible class of non-Bayesian learning rules on social networks under which agents update their beliefs about an unknown state through repeated communication with each other. When every agent is truthful, these rules can represent learning behaviors motivated by common behavioral heuristics, and the population converges to a consensus under weak connectivity conditions. However, the possible presence of untruthful agents introduces a dilemma between learning from the wisdom of the crowd and curtailing the spread of misinformation. The logic of convergence implies that the consensus is easily manipulable. A truthful agent on a connected network can avoid convergence to a false belief only if she stops learning from anyone who is indirectly influenced by misinformation. Without common knowledge of who the manipulators are, even truthful agents on a highly connected network can be locked into isolated groups, each with a different long-term belief.
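The manipulability argument above can be illustrated with a minimal sketch. It assumes DeGroot-style weighted averaging as one simple instance of a non-Bayesian rule; the class of rules studied in the paper may well be broader or different. The four-agent line network, the listening weights, and the initial beliefs are all hypothetical.

```python
def degroot_step(weights, beliefs):
    """One round of averaging: each agent's new belief is a weighted
    mean of the beliefs of the agents she listens to."""
    return [sum(w * b for w, b in zip(row, beliefs)) for row in weights]


def run(weights, beliefs, rounds=500):
    """Iterate the averaging rule for a fixed number of rounds."""
    for _ in range(rounds):
        beliefs = degroot_step(weights, beliefs)
    return beliefs


# Hypothetical line network 0 - 1 - 2 - 3; each row of weights sums to 1.
truthful = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.50, 0.50],
]
initial = [0.2, 0.4, 0.6, 0.8]

# Assumption: agent 3 is a manipulator who never updates her report.
manipulated = [row[:] for row in truthful]
manipulated[3] = [0.0, 0.0, 0.0, 1.0]

# Agent 1 protects herself by no longer listening to agent 2,
# who is directly influenced by the manipulator.
isolated = [row[:] for row in manipulated]
isolated[1] = [0.5, 0.5, 0.0, 0.0]

consensus = run(truthful, initial)    # everyone truthful
hijacked = run(manipulated, initial)  # one stubborn manipulator
split = run(isolated, initial)        # agent 1 cuts the tainted link

for name, result in [("truthful", consensus),
                     ("manipulated", hijacked),
                     ("isolated", split)]:
    print(name, [round(b, 3) for b in result])
```

In this toy example the all-truthful network settles on a common belief of about 0.5, the single manipulator drags every agent to her own report of 0.8, and cutting the tainted link leaves the population with distinct long-run beliefs (about 0.3 for agents 0 and 1, 0.55 for agent 2, and 0.8 for the manipulator).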
Less Privacy for Better Service? (with Dirk Bergemann)