Language of Bargaining (2024)

Mourad Heddaya
University of Chicago
mourad@uchicago.edu

Solomon Dworkin
University of Chicago
solomon.dworkin@gmail.com

Chenhao Tan
University of Chicago
chenhao@uchicago.edu

Rob Voigt
Northwestern University
robvoigt@northwestern.edu

Alexander Zentefis
Yale University
alexander.zentefis@yale.edu

Abstract

Leveraging an established exercise in negotiation education, we build a novel dataset for studying how the use of language shapes bilateral bargaining. Our dataset extends existing work in two ways: 1) we recruit participants via behavioral labs instead of crowdsourcing platforms and allow participants to negotiate through audio, enabling more naturalistic interactions; 2) we add a control setting where participants negotiate only through alternating, written numeric offers. Despite the two contrasting forms of communication, we find that the average agreed prices of the two treatments are identical. But when subjects can talk, fewer offers are exchanged, negotiations finish faster, the likelihood of reaching agreement rises, and the variance of prices at which subjects agree drops substantially. We further propose a taxonomy of speech acts in negotiation and enrich the dataset with annotated speech acts. Our work also reveals linguistic signals that are predictive of negotiation outcomes.

1 Introduction

Bilateral bargaining, in the sense of a goal-oriented negotiation between two parties, is a fundamental human social behavior that takes shape in many areas of social experience. Driven by a desire to better understand this form of interaction, a rich body of work in economics and psychology has evolved to study bargaining (Rubin and Brown, 1975; Bazerman et al., 2000; Roth, 2020). However, this work has seldom paid careful attention to the use of language and its fine-grained impacts on bargaining conversations; indeed, many studies operationalize bargaining as simply the back-and-forth exchange of numerical values. Meanwhile, there is growing interest in bargaining in NLP oriented towards the goal of building dialogue systems capable of engaging in effective negotiation (Zhan et al., 2022; Fu et al., 2023). In this work, we aim to bridge these two lines of work and develop a computational understanding of how language shapes bilateral bargaining.

To do so, building on a widely used negotiation-education exercise that involves bargaining over the price of a house, we develop a controlled experimental environment to collect a dataset of bargaining conversations. (Dataset access may be requested at https://mheddaya.com/research/bargaining.) The treatment in our experiment is the manner in which subjects communicate: either through alternating, written, numeric offers (the alternating offers or AO condition) or unstructured, verbal communication (the natural language or NL condition). Furthermore, to encourage naturalistic interactions, we recruit participants via behavioral labs and allow participants to negotiate in a conversational setting using audio on Zoom instead of crowdsourcing text conversations as prior work has done (Asher et al., 2016; Lewis et al., 2017; He et al., 2018). In total, we collect a dataset with 230 alternating-offers negotiations and 178 natural language negotiations. In contrast with the Craigslist negotiation dataset of He et al. (2018), our natural language negotiations have an average of over 4x more turns exchanged during each conversation, so our dataset represents a richer source to explore linguistic aspects of bargaining behavior than has been presented by existing work in this area.

In addition, we enrich the dataset by annotating all the conversations with a set of negotiation-specific speech acts. Inspired by prior work on rhetorical strategies in negotiations (Chang and Woo, 1994; Weigand et al., 2003; Twitchell et al., 2013), we create a simplified taxonomy of what we term bargaining acts and hire undergraduate research assistants to provide annotations. To the best of our knowledge, our dataset of speech acts in negotiations is an order of magnitude larger than existing datasets.

We first provide descriptive results based on our dataset. Although the AO and NL conditions are conducted via different communication mechanisms, they reach the same average agreed prices. However, when subjects can talk, fewer offers are exchanged, negotiations finish faster, the likelihood of reaching agreement rises, and the variance of prices at which subjects agree drops substantially. These observations suggest that the use of language facilitates collaboration. We also find differences in how buyers and sellers employ bargaining acts.

Recorded and transcribed speech provides more direct access to the intuitive attitudes and behaviors of the buyers and sellers. This enables us to identify subtle types of expression that are predictive of negotiation outcomes and reveal underlying dynamics of negotiation. Other findings corroborate conclusions from Lee and Ames (2017), who distinguish the effectiveness of negotiators’ different expressions of the same rationale.

We set up prediction tasks to predict the outcome of a negotiation based on features of the conversation and analyze the important features contributing to class differentiation. Our results show that LIWC features provide consistently strong performance and even outperform Longformer (Beltagy et al., 2020) given the beginning of a negotiation. Important features reveal that successful sellers drive and frame the conversation early on by using interrogative words to prompt buyers with targeted questions, while successful buyers convey their personal considerations and concerns while using negative expressions to push for lower prices.

In summary, we make the following contributions:

• We build a novel dataset of bargaining and provide annotations of bargaining acts.
• We demonstrate that the ability to communicate using language facilitates cooperation.
• Our work reveals linguistic signals that are predictive of negotiation outcomes. For instance, it is advantageous to drive the negotiation, rather than to be reactive to the other party’s arguments.

2 Related Work

Negotiation is a growing area of study in computer science. Zhan et al. (2022) provide an excellent survey of research on negotiation dialogue systems. Lewis et al. (2017) train recurrent neural networks to generate natural language dialogues in negotiations. He et al. (2018) propose a modular generative model based on dialogue acts. Our focus is on deriving computational understanding of how language shapes negotiation.

Several research disciplines have studied bilateral bargaining from different perspectives and using different tools. Economic theory has investigated the role of incomplete information (Ausubel et al., 2002) and highlighted the role of explicit communication (Crawford, 1990; Roth, 2020). Bazerman et al. (2000) and Pruitt (2013) provide an overview of the psychology literature on negotiation. However, these studies tend to overlook the content of the communication, with some notable exceptions (Swaab et al., 2011; Jeong et al., 2019; Lee and Ames, 2017).

The most related work to ours is Lee and Ames (2017), who study how bargaining outcomes are affected by the way a rationale is expressed. They find that expressions that hint at a constraint (e.g., “I can’t pay more”) are more effective at shaping a seller’s views of the buyer’s willingness to pay than critique rationales (e.g., “it’s not worth more”).

3 Dataset

The first contribution of our work is building the first transcript dataset of spoken natural language bargaining between lab experiment participants. Our dataset extends existing datasets in four ways:

1. Negotiation happens in spoken language, and is thus more fluid and natural, akin to real-world bargaining scenarios, such as price haggling in vendor markets, union negotiations, or diplomacy talks, while existing work is largely based on written exchanges (Asher et al., 2016; Lewis et al., 2017; He et al., 2018);
2. Our work is the first one to introduce a control condition without the use of natural language;
3. Participants are recruited through behavioral labs at universities and their incentive structure is more high-powered (i.e., bonus earnings based on outcomes and payment exceeding the typical $12 hourly wage) than for a crowdworker on Amazon Mechanical Turk;
4. We supplement the transcripts with manual annotation of speech acts (see §4).

While contributing greatly to our understanding of negotiation, existing bargaining datasets are somewhat limited in being based on written exchanges (He et al., 2018), often in the context of a highly structured game (Asher et al., 2016; Lewis et al., 2017).

Experiment design.

We conducted a controlled experiment whose setting reflected a common life experience: the purchase or sale of a house. We adapted the setting in “Buying a House” by Sally Blount, a popular exercise from the Dispute Resolution Research Center (DRRC) of Northwestern University’s Kellogg School of Management (Blount, 2000). (Thanks to the DRRC for kindly granting us permission to base our bargaining setting on this negotiation exercise that teaches purely distributive, i.e., zero-sum, bargaining between two parties.) We randomly paired participants and each was assigned the role of buyer or seller. In each pairing, buyer and seller negotiated a price of the house anonymously. Both buyer and seller were aware of the listing price of $240,000 and shared the same descriptions of the house and surrounding area, along with recent sales prices of comparable homes. However, each participant was given a private valuation of the house ($235,000 for the buyer and $225,000 for the seller).

Participant bonus earnings depended on bargaining outcomes to incentivize subjects to engage in realistic negotiating behavior. If no agreement was reached, neither party earned bonus money. On an hourly basis, compensation seemed significant enough to influence participant behavior (i.e., at least $40/hour was on the table per round). On average, subjects earned roughly $23.25/hour. More details can be found in Appendix B.

Each subject participated in two bargaining rounds. In one round, a buyer-seller pair communicated via alternating offers (AO) in an online chat that only accepted numeric entries. Each participant could choose to accept or counter each offer they received. In the other round, participants played the same role, either buyer or seller, but were assigned a new partner. In this round, each pair communicated in natural language (NL) via audio only on Zoom (we monitored that videos were turned off to avoid signals from gestures and facial expressions). The subjects were restricted from disclosing their private value and compensation structure and informed that doing so would result in forfeiture of their earnings. (To control for the order of the two treatments affecting the bargaining outcomes, roughly half the sessions, covering 58% of the negotiations, began with the round of alternating offers, whereas the other half began with the round of natural language. We did not detect any ordering effects.) Our experiment was approved by the IRB at Yale University.

Preprocessing.

We transcribed the audio from the Zoom negotiation settings using Amazon Transcribe. Transcription produces strictly alternating seller and buyer turns, without sentence segmentation. We use the resulting transcripts for the annotation and analyses described in this paper. We trim the end of each negotiation at the point of agreement on a final price for the house, discarding any interaction that occurs subsequently. We describe in §4 the annotation procedures that allowed us to reliably identify the point of agreement.

	Alternating Offers	Natural Language
No. of Turns	29.2	42.50
No. of New Offers	17.9	6.06
No. of Repeat Offers	11.3	1.56
Duration (min)	9.5	6.5
Avg Turn Length (sec)	28.9	12.54
Prob. of Agreement (%)	90.0	97.19
Agreed Price ($000s)	229.9	229.8
No. of Negotiations	230	178
No. of Unique Participants	460	356

Descriptive statistics.

Table 1 provides descriptive statistics of the AO and NL treatments. Since a failed negotiation results in no bonus for both sides, most negotiations end with a successful sale. Nevertheless, the probability of agreement is roughly 7 percentage points higher under NL than AO (97.2% versus 90.0%). A two-tailed t-test with heteroskedasticity-robust standard errors shows that the difference in agreement probability is significant. Moreover, in contrast with the AO treatment, the NL treatment produces negotiations that, on average, have $\sim 1.5$x more turns, but NL turns are over 50% shorter in duration, and NL negotiations are roughly 30% shorter in total duration and feature about 74% fewer offers.
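The relative differences quoted above can be recomputed directly from the Table 1 averages. The following is a minimal sketch for checking the arithmetic; the dictionary keys and the helper name are illustrative, and it assumes that the "fewer offers" figure compares the sum of new and repeat offers in each treatment.

```python
# Averages copied from Table 1 as (AO, NL) pairs. Key names are illustrative.
table1 = {
    "turns": (29.2, 42.50),
    "new_offers": (17.9, 6.06),
    "repeat_offers": (11.3, 1.56),
    "duration_min": (9.5, 6.5),
    "turn_length_sec": (28.9, 12.54),
    "agreement_pct": (90.0, 97.19),
}


def pct_reduction(ao, nl):
    """Percentage by which the NL value is lower than the AO value."""
    return 100.0 * (ao - nl) / ao


# Ratio of NL turns to AO turns.
turn_ratio = table1["turns"][1] / table1["turns"][0]

# How much shorter NL turns and NL negotiations are, in percent.
turn_length_drop = pct_reduction(*table1["turn_length_sec"])
duration_drop = pct_reduction(*table1["duration_min"])

# Assumption: "offers" means new offers plus repeat offers.
ao_offers = table1["new_offers"][0] + table1["repeat_offers"][0]
nl_offers = table1["new_offers"][1] + table1["repeat_offers"][1]
offer_drop = pct_reduction(ao_offers, nl_offers)

# Gap in agreement probability, in percentage points.
agreement_gap = table1["agreement_pct"][1] - table1["agreement_pct"][0]

print(f"turn ratio: {turn_ratio:.2f}x")
print(f"turn length drop: {turn_length_drop:.1f}%")
print(f"duration drop: {duration_drop:.1f}%")
print(f"offer drop: {offer_drop:.1f}%")
print(f"agreement gap: {agreement_gap:.2f} points")
```

Under this reading, the AO offer total (17.9 + 11.3) matches the AO turn count of 29.2, which is what one would expect if every AO turn is a numeric offer.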

Surprisingly, without the ability to communicate using language, buyers and sellers are less efficient in reconciling their differences. In the AO treatment, the combination of fewer turns that are each, individually, longer in duration is telling. Interlocutors are spending more time silently strategizing and considering their next act. However, this time invested is fruitful neither individually nor at the level of coordination, as exemplified by a lower probability of agreement and equivalent agreed prices among successful negotiations, likely due to an impoverished channel of communication.

Bargaining Act	Definition	Example
New offer	Any numerical price, not previously mentioned, that is offered by either the buyer or seller throughout the course of the negotiation.	That’s still $30,000 out of my budget but I would be willing to pay 210,000
Repeat Offer	Any numerical price presented that is an exact repeat of a previously presented offer; in a literal sense, these are redundant offers that were already on the table.	Yeah I understand um you still think that to 240,000 is too high right
Push	Any overt linguistic effort made by either party to bring the other party’s offer closer to theirs.	Might just be a little too low for what I have to offer here
Comparison	Evokes a difference or similarity between an aspect of the seller’s house and other external houses or considerations.	Like there’s one for 213k Which is like smaller and it’s nearby so that’s closer to our budget, we’ve seen that apartment it’s not as like it’s not as furnished and it’s kind of old and so
Allowance	Any time either party adjusts their offer price closer to the other party’s most recent offer. An allowance may be interpreted as the accompanying interaction to a successful push act.	I mean really like it probably should be higher than 233 but we’re willing to drop it to 233
End	End of negotiation via offer acceptance entering mutual common ground - explicitly only happens once.	Alright 228 it is

Figure 1 shows that the distributions of agreed prices largely overlap between the two treatments, but the distribution in prices under NL is substantially narrower than under AO. Between the two treatments, the mean agreed price conditional on reaching agreement is essentially identical (about $229.8 thousand). However, the standard deviation of agreed prices under NL is about one-third of that under AO (3.1 versus 10.4). A Fligner-Killeen (FK) (Fligner and Killeen, 1976) two-sample scale test shows that the standard deviation of the AO price distribution is statistically larger than the NL counterpart.

4 Bargaining Act Annotation

Previous researchers have recognized the inherently speech-act-like character of negotiations (Chang and Woo, 1994; Weigand et al., 2003; Twitchell et al., 2013). Many or most utterances in a bargaining context can be thought of as taking some action with reference to the negotiation. Here we propose and present a simplified ontology of negotiation-oriented speech acts (hereafter, bargaining acts) relevant to the present context of negotiation. Two trained undergraduate research assistants annotated all transcripts according to six bargaining acts: 1) new offer, 2) repeat offer, 3) push, 4) comparison, 5) allowance, and 6) end. Table 2 provides definitions and examples. Note that each turn can include multiple bargaining acts. In addition, each speech act is also annotated with a numerical offer, if applicable.

Twenty-four transcripts were annotated by both annotators to allow agreement to be calculated. Using MASI distance weighting (Passonneau, 2006), we found a Krippendorff’s alpha (Hayes and Krippendorff, 2007) of 0.72, representing a high degree of agreement for a pragmatic annotation task.
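Because a turn can carry several bargaining acts, each annotation is a set of labels, which is why a set-valued distance such as MASI is needed. The sketch below illustrates one way to compute MASI distance and a Krippendorff-style alpha over set-valued labels. It is not the exact procedure behind the reported 0.72: the function names are hypothetical, the monotonicity weights (1, 2/3, 1/3, 0) follow the usual description of MASI, and the distance is used directly as the disagreement function.

```python
from itertools import permutations


def masi_distance(a, b):
    """MASI distance between two label sets: 1 - Jaccard * monotonicity."""
    a, b = frozenset(a), frozenset(b)
    if not a and not b:
        return 0.0
    intersection = a & b
    jaccard = len(intersection) / len(a | b)
    if a == b:
        monotonicity = 1.0        # identical sets
    elif a <= b or b <= a:
        monotonicity = 2.0 / 3.0  # one set contains the other
    elif intersection:
        monotonicity = 1.0 / 3.0  # overlap, but neither contains the other
    else:
        monotonicity = 0.0        # disjoint sets
    return 1.0 - jaccard * monotonicity


def krippendorff_alpha(units, distance=masi_distance):
    """Alpha = 1 - observed / expected disagreement.

    `units` is a list of turns; each turn is a list with one label set per
    annotator. Turns with fewer than two annotations are ignored. Raises
    ZeroDivisionError if every annotation is identical (alpha is undefined).
    """
    units = [u for u in units if len(u) >= 2]
    values = [v for u in units for v in u]
    n = len(values)
    observed = sum(
        sum(distance(x, y) for x, y in permutations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n
    expected = sum(
        distance(x, y) for x, y in permutations(values, 2)
    ) / (n * (n - 1))
    return 1.0 - observed / expected


# Hypothetical annotations for three turns by two annotators.
example = [
    [{"push"}, {"push"}],
    [{"new offer", "comparison"}, {"new offer"}],
    [{"allowance"}, {"allowance"}],
]
print(round(krippendorff_alpha(example), 3))
```

Partial overlap, as in the second turn of the example, is penalized less than a complete mismatch, which is the property that makes MASI suitable for multi-label turns.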

Figure 2 shows that new offers, pushes, and comparisons are relatively more frequent and appear more consistently in all the negotiations than allowances and repeat offers. We note in Table 1 that repeat offers are dramatically more common in the AO condition than the NL condition (11.3 vs. 1.56 per negotiation). With linguistic context, negotiators are less likely to engage in fundamentally uncooperative behavior by simply repeating past offers over again.

Comparing buyers to sellers, we observe that buyers make on average one more new offer per negotiation than sellers (independent-sample, heteroskedasticity-robust t-test, $p=0.02$). We find no statistically significant differences between roles for the other five bargaining acts.

The bargaining act annotations allow us to describe a negotiation as a sequence of offers proposed by the buyer and seller. We compare how the frequency and pattern of numerical offers differ across 1) experimental treatments (NL vs. AO) and 2) negotiation outcomes. We characterize different properties of the negotiations as well as their trajectories over the course of the interaction.

Figure 3 reveals three general patterns on offer trajectories. First, both AO and NL bargaining feature a similar range of new offers exchanged in the early stages of the negotiation. Early on, buyers in both treatments present new offers as low as 170; and sellers, as high as 270. But extreme offers are more prevalent in AO than NL bargaining. Second, both the AO and NL trajectories exhibit a rhythmic pattern of low and high offers, which is familiar from real-world negotiations. The buyer’s low offer is countered by the seller’s high offer, which is then countered by the buyer’s slightly increased low offer, and so on. Third, NL bargaining takes far fewer new offers to reach agreement than AO bargaining. Figure 3(b) clearly demonstrates that NL negotiations converge more quickly, with consecutive offers converging to within $5K after 6 new offers. AO negotiations take over 40 new offer exchanges to reach a similar convergence.
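One way to make the convergence statement concrete is to count how many new offers are exchanged before two consecutive offers first fall within a threshold of each other. The sketch below is only an illustration of that idea: the function name and the example trajectory are hypothetical, and the exact convergence criterion behind the figure is not specified here.

```python
def offers_until_convergence(offers, threshold=5.0):
    """Return how many offers have been made when two consecutive offers
    first differ by at most `threshold` (same units as the offers, here
    thousands of dollars). Returns None if that never happens."""
    for i in range(1, len(offers)):
        if abs(offers[i] - offers[i - 1]) <= threshold:
            return i + 1
    return None


# Hypothetical trajectory in $000s, alternating buyer (low) and seller (high).
trajectory = [200, 260, 215, 245, 225, 235, 228, 232]
print(offers_until_convergence(trajectory))
```

In this made-up trajectory the gaps between consecutive offers shrink from 60 to 4, so the threshold of 5 is first met at the eighth offer.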

5 Predicting Negotiation Outcomes

Finally, we set up prediction tasks to understand the relationship between the use of natural language and negotiation success. Overall, our models demonstrate performance gains over the majority class in most settings. Surprisingly, logistic regression using bag-of-words and LIWC category features outperforms the neural model. We observe differentiation between classification accuracy on seller-only and buyer-only speech, and highlight features that explain this difference.

5.1 Experimental Setup

Task.

We consider a binary classification task with two classes: 1) “seller win” and 2) “buyer win”, where a negotiation is classified by whether it concluded with an agreed price greater than $230K or less than $230K, respectively. We focus on negotiations that end with an advantage for either the buyer or seller to better understand the dynamics that produce an asymmetric outcome. Hence, we omit the negotiations that ended with $230K or that did not reach an agreed price. This leaves us 119 negotiations.
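The labeling rule can be written as a small function. This is an illustrative sketch: the names are hypothetical, prices are assumed to be in dollars, and a failed negotiation is represented as `None`.

```python
# $230K is the midpoint of the buyer's ($235,000) and seller's ($225,000)
# private valuations given in the experiment design.
MIDPOINT = 230_000


def label_outcome(agreed_price):
    """Map an agreed price to a class label.

    Returns None for negotiations excluded from the task: those without an
    agreement (agreed_price is None) and those that ended at exactly $230K.
    """
    if agreed_price is None or agreed_price == MIDPOINT:
        return None
    return "seller win" if agreed_price > MIDPOINT else "buyer win"


# Hypothetical outcomes: two labeled negotiations and two excluded ones.
prices = [233_000, 228_000, 230_000, None]
labels = [label_outcome(p) for p in prices]
print(labels)
```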

As the predictive task may become trivial if we see the entire exchange, we build 10 versions of each negotiation by incrementally adding proportions of the negotiation to the input with a step size of 10%. Thus, we obtain input/output pairs $(X_{k},y)$ for a given negotiation, where $k\in\{10\%,\dots,100\%\}$, and each $k$ corresponds to a different prediction task; namely, whether the negotiation outcome can be predicted from the first $k$ percent of the interaction.
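Constructing the inputs $X_{k}$ amounts to taking growing prefixes of each negotiation. The sketch below assumes that fractions are measured in turns and that partial turns are rounded up; both are assumptions, and the function name is hypothetical.

```python
def prefix_versions(turns, step=10):
    """Return a dict mapping k (in percent) to the first k% of the turns.

    The number of turns is rounded up, so even the 10% version of a short
    negotiation contains at least one turn.
    """
    versions = {}
    for k in range(step, 101, step):
        n_turns = (len(turns) * k + 99) // 100  # integer ceiling division
        versions[k] = turns[:n_turns]
    return versions


# Hypothetical negotiation with 25 turns, represented by turn indices.
versions = prefix_versions(list(range(25)))
for k, prefix in versions.items():
    print(k, len(prefix))
```

Every version of a negotiation shares the same label $y$, so the ten prediction tasks differ only in how much of the interaction is visible.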

Methods.

We test two setups for our task. The first is a standard linear model with logistic regression. The second is an end-to-end approach using Longformer, a transformer-based model for encoding and classifying long sequences. In particular, we use the encoder and output modules of LongformerEncoderDecoder (LED) (Beltagy et al., 2020), a variant of the original Longformer model, which can encode sequences up to 16,384 tokens in length. This exceeds the maximum input length in our dataset.

In the logistic regression experiments, we treat the numerical offers as an oracle and consider three other feature sets: 1) transcription texts; 2) bargaining acts; 3) LIWC categories (Tausczik and Pennebaker, 2010). (We also tried the union of these features, but it did not materially affect the performance.) We represent each negotiation as a binary bag-of-words encoding of the features listed above. For bargaining acts, we construct the vocabulary based on unigrams and bigrams; for the other feature sets, we only include unigrams. We include bigrams for bargaining acts to capture local combinations of bargaining acts. To maintain a reasonable vocabulary size, we only consider unigrams from the transcribed text that occur in at least 5 negotiations (see Appendix C for total feature counts). We replace numbers mentioned in the text with a generic [NUM] token to eliminate the strongly predictive signal of new offers and focus on language instead. In experiments with LED, we add two special tokens [SELLER] and [BUYER] that we concatenate to the start of each turn depending on who is speaking. We make no other changes to the transcribed text. The input to LED is the concatenation of all the turns.
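The feature construction for the logistic regression models can be sketched as follows. This is an illustration rather than the exact pipeline: the function names, the whitespace tokenization, and the regular expression used to detect numbers are all assumptions.

```python
import re
from collections import Counter

# Hypothetical pattern for numeric mentions such as "210,000" or "213k".
NUM_PATTERN = re.compile(r"\$?\d[\d,]*(?:\.\d+)?k?", re.IGNORECASE)


def text_features(transcript):
    """Binary unigram features with every number replaced by [NUM]."""
    masked = NUM_PATTERN.sub(" [NUM] ", transcript.lower())
    return set(masked.split())


def act_features(acts):
    """Binary unigram and bigram features over a sequence of bargaining acts."""
    unigrams = set(acts)
    bigrams = {f"{first} {second}" for first, second in zip(acts, acts[1:])}
    return unigrams | bigrams


def build_vocabulary(feature_sets, min_negotiations=5):
    """Keep features that occur in at least `min_negotiations` negotiations."""
    counts = Counter(token for features in feature_sets for token in features)
    return sorted(t for t, c in counts.items() if c >= min_negotiations)


def encode(features, vocabulary):
    """Binary bag-of-words vector aligned with the vocabulary order."""
    return [1 if token in features else 0 for token in vocabulary]


print(sorted(text_features("I would pay 210,000 or 213k")))
print(sorted(act_features(["b-push", "b-new", "b-compare"])))
```

Because the encoding is binary, a word or act that appears many times in a negotiation contributes the same as one that appears once. In the paper the frequency threshold of 5 is applied to text unigrams; the sketch exposes it as a parameter.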

Evaluation.

We use accuracy as our main evaluation metric. In all experiments, due to the relatively small size of our dataset, we use nested five-fold cross validation for both inner and outer cross validations. For logistic regression, we grid search the best $\ell_{2}$ coefficient within $\{2^{x}\}$, where $x$ ranges over 11 values evenly spaced between –10 and 1. We further concatenate the speaker (‘buyer’ or ‘seller’) and the turn position within the negotiation to each feature. We treat these as hyper-parameters. We represent the position as $k$, where $k$ corresponds to a fraction of the conversation, as defined earlier. For example, the word “house” spoken by the seller in the first 10% of turns in a negotiation would be tokenized as “s1-house”. In the LED experiments, we omit the inner cross validation and use a batch size of 4, the largest possible batch size given our memory constraints. (We use a single Nvidia A40 GPU in our LED experiments.) We select the best performing learning rate out of $\{5\times 10^{-5},3\times 10^{-4},3\times 10^{-3}\}$ and early stop based on training loss convergence.
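Two details of this setup are easy to misread, so a small sketch may help: the regularization grid and the speaker-position prefix. The function names are hypothetical, turn indices are assumed to be zero-based, and the way turns are binned into tenths is an assumption consistent with the “s1-house” example.

```python
def l2_grid(n_values=11, low=-10.0, high=1.0):
    """Candidate coefficients 2**x for x evenly spaced in [low, high]."""
    return [2.0 ** (low + i * (high - low) / (n_values - 1))
            for i in range(n_values)]


def positional_token(word, speaker, turn_index, n_turns):
    """Prefix a word with the speaker initial and the tenth of the
    negotiation (1 to 10) in which the turn occurs."""
    tenth = min(10, turn_index * 10 // n_turns + 1)
    return f"{speaker[0]}{tenth}-{word}"


grid = l2_grid()
print(grid[0], grid[-1])
print(positional_token("house", "seller", 3, 40))
print(positional_token("house", "buyer", 39, 40))
```

With 11 values between –10 and 1 the exponents are spaced 1.1 apart, so the grid runs from $2^{-10}$ to $2^{1}$. Whether each value is used as the penalty strength or its inverse depends on the library and is not specified here.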

5.2 Results

Predictive performance.

We start by looking at the overall predictive performance. Figure 4 presents results for all models. For the oracle condition (numerical), as expected, prediction accuracy increases monotonically and steadily as the fraction of the conversation and the corresponding numerical offers in the input increases from $10\%$ to $100\%$ of the conversation. As the buyer and seller converge towards an agreed price, the offers made provide strong signal about the outcome.

However, this task proves much more challenging for other models where we do not include numerical offers provided by annotators. One intriguing observation is that LED consistently under-performs logistic regression. Within logistic regression, LIWC categories outperform other features and achieve 63.1% accuracy whereas text-based BOW features achieve a best score of 58.9%. Furthermore, there is no clear trend of performance growing as the fraction of negotiation increases. While bargaining acts under-perform other features overall, there is a notable jump in accuracy at fraction $30\%$, which we will revisit later.

Buyer vs. seller.

In bilateral bargaining, an interesting question is which party drives the negotiation, and to what effect. To further understand the role of buyer vs. seller, we only consider features of buyer texts or seller texts.

Although the performance of LIWC does not vary much for buyer and seller texts (Figure 4(a)), Figures 4(b) and 4(c) show contrasting differences in prediction accuracy for sellers and buyers at various fractions of a negotiation. Seller transcription text achieves roughly 10% higher accuracy than buyer and buyer + seller text at fractions $20\%$ ($p=0.01$), $30\%$ ($p=0.01$), $90\%$ ($p=0.001$), and $100\%$ ($p=0.01$). Meanwhile, buyer bargaining acts outperform seller acts throughout and are particularly effective at $40\%$ ($p=0.008$) and $50\%$ ($p=0.03$) of the negotiation.

Important features.

To understand in greater detail which features are more helpful for prediction, we compare the fitted logistic regression models’ feature coefficients. (We use the average coefficients of the five models in cross validation.) Coefficients with the largest absolute values are associated with more discriminating features.
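The ranking procedure can be sketched as averaging each feature’s coefficient over the cross-validation folds and then sorting by magnitude within each sign. The function names, the coefficient values, and the sign convention (positive means “seller win”) below are all hypothetical, and features absent from a fold are assumed to have a coefficient of zero.

```python
def average_coefficients(fold_coefficients):
    """Average each feature's coefficient across folds.

    `fold_coefficients` is a list of dicts (feature name -> coefficient),
    one per fitted model. Missing features count as 0.0.
    """
    names = set().union(*fold_coefficients)
    n_folds = len(fold_coefficients)
    return {
        name: sum(fold.get(name, 0.0) for fold in fold_coefficients) / n_folds
        for name in names
    }


def top_features(averaged, n=5):
    """Return the n strongest features for each class.

    Assumes positive coefficients favor "seller win" and negative
    coefficients favor "buyer win".
    """
    seller = sorted((f for f in averaged if averaged[f] > 0),
                    key=lambda f: -averaged[f])[:n]
    buyer = sorted((f for f in averaged if averaged[f] < 0),
                   key=lambda f: averaged[f])[:n]
    return seller, buyer


# Made-up coefficients from two folds, for illustration only.
folds = [
    {"s1-interrog": 0.8, "b4-money": -0.6},
    {"s1-interrog": 0.6, "b4-money": -0.4, "b1-netspeak": 0.2},
]
averaged = average_coefficients(folds)
print(top_features(averaged))
```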

We first discuss features from LIWC, our best performing feature set (Table 3(a)). Interrogative words spoken by the sellers at the beginning of the negotiations (“s1-interrog”) are consistently and strongly predictive of seller wins. An example use by the seller is “so tell me about what you’re looking for in a house”. From the buyers’ points of view, it appears to be disadvantageous to use informal language, such as “mhm”, “k”, “yep”, and “huh” (“b1-netspeak”), especially at the beginning of the negotiation. One interpretation could be that the buyer signals passivity, allowing the seller to drive the conversation and establish their asking price and justification for it. Overall, these two patterns suggest that sellers benefit from controlling the direction of the conversation early on.

	10%	30%	50%	70%	90%
Buyer Win	s2-social, s1-time, s1-compare, b1-adj, b1-focuspast	s2-you, s2-social, s3-social, b3-posemo, s2-space	b3-posemo, s3-social, s2-space, s5-money, b4-negemo	s7-home, b4-negemo, s2-you, s2-cogproc, b4-money	b4-money, s8-bio, s2-interrog, b4-negemo, s2-space
Seller Win	b1-motion, b1-netspeak, b1-i, b1-focuspresent, s1-adverb	s1-interrog, b3-you, b1-netspeak, b1-motion, b3-bio	s1-interrog, b1-netspeak, b3-bio, s1-you, s1-conj	b3-bio, s1-interrog, b1-netspeak, b3-focusfuture, b3-reward	s1-interrog, b3-bio, b1-netspeak, b1-motion, s4-focuspast

	10%	30%	50%	70%	90%
Buyer Win	b-push, b-push b-new, b-repeat b-push, b-new, b-new b-push	b-push b-compare, b-new b-compare, b-push, b-compare b-repeat, b-push b-repeat	b-new b-compare, b-new b-repeat, b-push b-compare, b-push, b-push b-new	b-push b-compare, b-new b-compare, b-new b-allow, b-push b-allow, b-allow b-push	b-push b-compare, b-push b-new, b-repeat b-push, b-push, b-new b-compare
Seller Win	b-new b-compare, b-repeat, b-push b-compare, b-compare b-push, b-compare	b-allow b-compare, b-allow, b-compare b-allow, b-compare b-push, b-compare	b-allow b-compare, b-new b-push, b-compare b-push, b-repeat b-new, b-new	b-compare, b-repeat b-allow, b-allow, b-new, b-allow b-compare	b-repeat b-allow, b-allow b-compare, b-compare b-allow, b-allow, b-compare b-compare

Furthermore, LIWC categories “money”, “space”, and “home” are associated with buyer success. These categories consist of seller-spoken words like “area”, “location”, “floors”, and “room” and buyer-spoken words like “budget”, “pay”, and “priced”, among many others, which are used in reference to various aspects of the house and its price. Discussion of these subjects often revolves around the seller first justifying their asking price (“s2-space”), then the buyer disputing the house’s value or their ability to afford the seller’s price (“b4-money”). Additionally, buyer speech associated with negative emotions like “unfortunately”, “problem”, “sorry”, “lower”, and “risk” (“b4-negemo”) similarly appears 40% into the negotiation, along with mentions of money-related words. Buyers may benefit from moving the conversation away from concrete facts towards a discussion about what is an affordable or reasonable price for them. Crucially, successful buyers do so in a manner that portrays them as apologetic and considerate of the sellers’ interests. Given that the buyer requires movement on the asking price to succeed, they avoid language that explicitly acknowledges that the seller may be compromising their interests. This result echoes the important role of negative expressions in negotiation outcomes reported by Barry (2008).

Buyer: Okay well I really like the house but I think that The price of $235,000 is a bit excessive especially considering um the prices of some homes that are nearby The house I’m interested in that are selling for a lot less than that Um So I would definitely want to negotiate the price Um
Seller: Yeah How much how much was asking price again I believe it was 240
Buyer: Okay I think that a fair price would be around 218,000 Just considering other houses in the area
Seller: Um But like we also have like houses newly decorated we have like two fireplaces We also have a large eat in kitchen with all the appliances And uh comparing we all the house has uh 1,846 sq ft of space and which is more than the other first listing in appendix two

Another notable observation is that buyer-only bargaining acts are more predictive. To better make sense of this observation, Table 3(b) shows important features when predicting only with buyer bargaining act unigrams and bigrams. Most notably, new offers and pushes followed by comparisons consistently appear as two of the most influential features predictive of buyer wins.

We present an example excerpt in Table 4 to illustrate such sequences. In this case, the comparison is serving the role of justifying the buyer’s new offer of $218,000. This scenario often occurs the first time that a comparison is made by either party: it puts the seller in a position to defend their offer and provide counter-evidence in favor of dismissing the buyer’s offer. Notably, the buyer remains clear and focused in their comparison to other comparable houses. In contrast, when the seller responds, they invoke small details to attempt to justify their original price. This defensive and overly complex response weakens their bargaining position because the relative importance of these minute details may be debated and new evidence may be introduced by the buyer to further discount the seller’s position. This conclusion complements the finding that, in contrast to the seller, the buyer is advantaged when the seller discusses details of the property, as evidenced by the LIWC feature “s2-space”.

Further Evaluation.

As an additional experiment, we train a logistic regression model on the CraigslistBargain dataset (He et al., 2018) and test it on our dataset. We include seller and buyer text, and use the same text encoding procedure described in §5.1. In the CraigslistBargain dataset, the seller asking price is considered to be the seller’s private value for the item being sold and the buyer’s private value is separately specified. We consider the negotiation to be a seller win if the agreed price is higher than the midpoint between the two private values and a buyer win otherwise. Despite CraigslistBargain having a significantly larger training dataset, the maximum test accuracy across all 10 fractions of our negotiations dataset is 54%, whereas we achieve a maximum of 60% accuracy when we train and test on our dataset. This experiment underscores the distinctiveness of our dataset and suggests that it may differ linguistically in relevant ways from other datasets within the bargaining domain.

6 Conclusion

In this work, we design and conduct a controlled experiment for studying the language of bargaining. We collect and annotate a dataset of alternating offers and natural language negotiations. Our dataset contains more turns per negotiation than existing datasets and, since participants communicate orally, our setting facilitates a more natural communication environment. Our dataset is further enhanced with annotated bargaining acts. Our statistical analyses and prediction experiments confirm existing findings and reveal new insights. Most notably, the ability to communicate using language results in higher agreement rates and faster convergence. Both sellers and buyers benefit from maintaining an active role in the negotiation and not being reactive to the other party.

Limitations

We note several important limitations of this work. Perhaps most importantly, our dataset is "naturalistic," but not actually "natural" in the sense of independently occurring in the world. Though the interactions between our participants are real, the task itself is ultimately artificially constructed. In a real-world negotiation over something as valuable and significant as a house, the negotiating parties will be much more invested in the outcome than our experimental participants, whose actions change their outcome on the order of a few dollars. This difference in turn could lead real-world negotiating parties to speak differently and possibly employ substantially different strategies than we observe.

Methodologically, our study has a few limitations. First, our analyses are based entirely on language that has been automatically transcribed (with some manual checks), and while this helps with expense and scale, these transcripts could be missing important subtleties that influence the outcome. Koenecke et al. (2020) uncover an important limitation of these systems, finding significant racial disparities in the quality of ASR transcriptions. Second, the linguistic feature analysis we perform should be treated as largely exploratory: it provides suggestive and correlational rather than causal evidence for the relationship between language in the interactions and negotiation outcomes.

Lastly, there are further linguistic and interactional phenomena at play that we have not yet integrated into the analysis. For one, we have access to the audio channel of participants’ actual speech, but we have not analyzed it in this work. There could very well be acoustic cues in participants’ speech that are as significant to the interactions as the textual features analyzed here, particularly speech prosody, which has been shown to communicate social meanings that could be highly relevant to negotiation, such as friendliness (Jurafsky et al., 2009). This particularly extends to more interactional questions of not simply who said what, but what was said in response to what and in what way. For instance, existing research has shown that acoustic entrainment in dialogue (e.g., interlocutors adapting to one another in terms of prosody) has important social associations with dialogue success (Levitan et al., 2012). We leave a deeper investigation of these phenomena for future work.

Broader Impacts

This research, together with prior and future related work, has the potential to advance our understanding of negotiation, a ubiquitous human activity. Our dataset can enable future research into the dynamics of human bargaining as well as interpersonal interactions more broadly. By employing the findings and insights gained from such research, individuals may enhance their ability to negotiate effectively in various settings, such as salary negotiations, personal relationships, and community initiatives. Meanwhile, we must acknowledge that while a better understanding of language as an instrument in social interaction can be empowering, it may also be used as a tool for manipulation.

Acknowledgements

We are grateful to Jessica Halten for helping us run the experiment through the Yale SOM Behavioral Lab. The experiment also would not have been possible without the excellent study session coordination by Sedzornam Bosson, Alexandra Jones, Emma Blue Kirby, Vivian Wang, Sherry Wu, and Wen Long Yang. We thank Rajat Bhatnagar for developing the web application used in the study. The human subjects experiment in this research was deemed exempt by the Yale University Institutional Review Board (IRB #2000029151). We thank Allison Macdonald and Sammy Mustafa for their effort in the data annotation process. Their work was an invaluable contribution to the success of this research project. We thank all members of the Chicago Human+AI Lab and LEAP Workshop for feedback on early versions of this work. Finally, we thank all anonymous reviewers for their insightful suggestions and comments.

References

Asher et al. (2016) Nicholas Asher, Julie Hunter, Mathieu Morey, Benamara Farah, and Stergos Afantenos. 2016. Discourse structure and dialogue acts in multiparty dialogue: the STAC corpus. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 2721–2727, Portorož, Slovenia. European Language Resources Association (ELRA).
Ausubel et al. (2002) Lawrence M. Ausubel, Peter Cramton, and Raymond J. Deneckere. 2002. Chapter 50 bargaining with incomplete information. Volume 3 of Handbook of Game Theory with Economic Applications, pages 1897–1945. Elsevier.
Barry (2008) Bruce Barry. 2008. Negotiator affect: The state of the art (and the science). Group Decision and Negotiation, 17(1):97–105.
Bazerman et al. (2000) Max H. Bazerman, Jared R. Curhan, Don A. Moore, and Kathleen L. Valley. 2000. Negotiation. Annual Review of Psychology, 51(1):279–314.
Beltagy et al. (2020) Iz Beltagy, Matthew E. Peters, and Arman Cohan. 2020. Longformer: The long-document transformer. CoRR, abs/2004.05150.
Blount (2000) Sally Blount. 2000. Buying a house. Dispute Resolution Research Center.
Chang and Woo (1994) Man Kit Chang and Carson C. Woo. 1994. A speech-act-based negotiation protocol: Design, implementation, and test use. ACM Trans. Inf. Syst., 12(4):360–382.
Crawford (1990) Vincent P. Crawford. 1990. Explicit communication and bargaining outcome. American Economic Review, 80(2):213–219.
Fligner and Killeen (1976) Michael A. Fligner and Timothy J. Killeen. 1976. Distribution-free two-sample tests for scale. Journal of the American Statistical Association, 71(353):210–213.
Fu et al. (2023) Yao Fu, Hao Peng, Tushar Khot, and Mirella Lapata. 2023. Improving language model negotiation with self-play and in-context learning from AI feedback.
Hayes and Krippendorff (2007) Andrew F. Hayes and Klaus Krippendorff. 2007. Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1(1):77–89.
He et al. (2018) He He, Derek Chen, Anusha Balakrishnan, and Percy Liang. 2018. Decoupling strategy and generation in negotiation dialogues. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2333–2343, Brussels, Belgium. Association for Computational Linguistics.
Jeong et al. (2019) Martha Jeong, Julia Minson, Michael Yeomans, and Francesca Gino. 2019. Communicating with warmth in distributive negotiations is surprisingly counterproductive. Management Science, 65(12):5813–5837.
Jurafsky et al. (2009) Dan Jurafsky, Rajesh Ranganath, and Dan McFarland. 2009. Extracting social meaning: Identifying interactional style in spoken conversation. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL ’09, pages 638–646, USA. Association for Computational Linguistics.
Koenecke et al. (2020) Allison Koenecke, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John R. Rickford, Dan Jurafsky, and Sharad Goel. 2020. Racial disparities in automated speech recognition. Proceedings of the National Academy of Sciences, 117(14):7684–7689.
Lee and Ames (2017) Alice J. Lee and Daniel R. Ames. 2017. “I can’t pay more” versus “it’s not worth more”: Divergent effects of constraint and disparagement rationales in negotiations. Organizational Behavior and Human Decision Processes, 141:16–28.
Levitan et al. (2012) Rivka Levitan, Agustín Gravano, Laura Willson, Štefan Beňuš, Julia Hirschberg, and Ani Nenkova. 2012. Acoustic-prosodic entrainment and social behavior. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 11–19, Montréal, Canada. Association for Computational Linguistics.
Lewis et al. (2017) Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, and Dhruv Batra. 2017. Deal or no deal? End-to-end learning of negotiation dialogues. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2443–2453, Copenhagen, Denmark. Association for Computational Linguistics.
Passonneau (2006) Rebecca Passonneau. 2006. Measuring agreement on set-valued items (MASI) for semantic and pragmatic annotation. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Pruitt (2013) Dean G. Pruitt. 2013. Negotiation behavior. Academic Press.
Roth (2020) Alvin E. Roth. 2020. Bargaining experiments. In The Handbook of Experimental Economics, pages 253–348. Princeton University Press.
Rubin and Brown (1975) Jeffrey Z. Rubin and Bert R. Brown. 1975. The social psychology of bargaining and negotiation. Elsevier.
Swaab et al. (2011) Roderick I. Swaab, William W. Maddux, and Marwan Sinaceur. 2011. Early words that work: When and how virtual linguistic mimicry facilitates negotiation outcomes. Journal of Experimental Social Psychology, 47(3):616–621.
Tausczik and Pennebaker (2010) Yla R. Tausczik and James W. Pennebaker. 2010. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1):24–54.
Twitchell et al. (2013) Douglas P. Twitchell, Matthew L. Jensen, Douglas C. Derrick, Judee K. Burgoon, and Jay F. Nunamaker. 2013. Negotiation outcome classification using language features. Group Decision and Negotiation, 22(1):135–151.
Weigand et al. (2003) Hans Weigand, Mareike Schoop, Aldo de Moor, and Frank Dignum. 2003. B2B negotiation support: The need for a communication perspective. Group Decision and Negotiation, 12(1):3–29.
Zhan et al. (2022) Haolan Zhan, Yufei Wang, Tao Feng, Yuncheng Hua, Suraj Sharma, Zhuang Li, Lizhen Qu, and Gholamreza Haffari. 2022. Let’s negotiate! A survey of negotiation dialogue systems. Working paper. California State University, Northridge, CA.
Zhang et al. (2021) Tianyi Zhang, Felix Wu, Arzoo Katiyar, Kilian Q. Weinberger, and Yoav Artzi. 2021. Revisiting few-sample BERT fine-tuning.

Appendix

Appendix A Negotiation Excerpts

Buyer: Okay well I really like the house but I think that The price of $235,000 is a bit excessive especially considering um the prices of some homes that are nearby The house I’m interested in that are selling for a lot less than that Um So I would definitely want to negotiate the price Um
Seller: Yeah How much how much Where the app was asking price again I believe it was 240
Buyer: Okay I think that a fair price would be around 218,000 Just considering other houses in the area
Seller: Um But like we also have like houses newly decorated we have like two fireplaces We also have a large eat in kitchen with all the appliances And uh comparing we all the house has uh 1,846 sq ft of space and which is more than the other first listing in appendix two

Buyer: My name is [name] Um I am an investor looking to buy a single household family in the neighborhood Um and your house based on the information that I was given seemed like a good option And I was looking at the housing market in the area and it seems like one of the houses that closely resembles your own house has been sold for $213,000 Um so I am interested in buying your house at a price somewhere close to that Uh price
Seller: Okay perfect Um Well um That house that you’re talking about was actually sold quite a while ago so the prices have appreciated quite a bit and now the asking price that we have is $240,000
Buyer: Yeah

Buyer: I do feel like even though I agree it’s a nice area it’s a bit overpriced Um I mean speaking of comparisons the one I’m looking at right now listing 89 I was 6898 The selling price they’re asking for is approximately 213,000 Um it has 1715 square feet And I’ve done the math That’s a difference of 131 sq ft The difference in your asking price And my offering is to 27,000 So that equates to about $206 per square foot Um That’s the difference and I think that’s a reasonable difference to make
Seller: Yeah the market has been weirdly slow around here lately Um So we could come down slightly uh into the high two thirties let’s say 2 39
Buyer: Um I’ll raise it 214
Seller: Mhm Um Right we’re gonna Stick with 239 I think

Appendix B Controlled Experiment

Compensation details summary.

Each subject received $10 for showing up and could earn additional bonus money per round. Bonus earnings depended on bargaining outcomes to incentivize subjects to engage in realistic negotiating behavior. Buyers could earn $1 in bonus for every $1,000 that the agreed sale price was below the buyer’s private value of $235,000, up to a maximum of $10 in bonus money. Sellers could earn $1 in bonus for every $1,000 that the agreed sale price was above the seller’s private value of $225,000, up to a maximum of $10. Given the private values of buyers and sellers, $10 of surplus was available to split. No party earned bonus money in a round if an agreement was not reached.
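The bonus rule above can be made concrete with a short sketch. It rests on two assumptions that the summary does not state: bonuses are treated as proportional to the price difference (rather than rounded to whole $1,000 steps), and a party who agrees to a price on the wrong side of their private value earns zero rather than a negative bonus. The function name `round_bonuses` is hypothetical.

```python
from typing import Optional, Tuple

BUYER_VALUE = 235_000   # buyer's private value, from the text
SELLER_VALUE = 225_000  # seller's private value, from the text
MAX_BONUS = 10.0        # per-round bonus cap in dollars, from the text


def round_bonuses(agreed_price: Optional[float]) -> Tuple[float, float]:
    """Return (buyer_bonus, seller_bonus) in dollars for one round.

    Pass None when no agreement was reached: neither party earns a bonus.
    Assumption: bonuses scale proportionally and are floored at zero.
    """
    if agreed_price is None:
        return 0.0, 0.0
    buyer = min(max((BUYER_VALUE - agreed_price) / 1000, 0.0), MAX_BONUS)
    seller = min(max((agreed_price - SELLER_VALUE) / 1000, 0.0), MAX_BONUS)
    return buyer, seller


print(round_bonuses(228_000))  # (7.0, 3.0)
print(round_bonuses(None))     # (0.0, 0.0)
```

For any agreed price between the two private values, the two bonuses sum to $10, which matches the surplus described above.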

Appendix C Logistic Regression Features

Features	Roles	10%	20%	30%	40%	50%	60%	70%	80%	90%	100%
LIWC	Buyer+Seller	266	296	409	547	687	824	962	1105	1244	1381
	Buyer	120	135	205	272	343	412	482	553	622	688
	Seller	146	161	204	275	344	412	480	552	622	693
Transcription Texts	Buyer+Seller	261	589	1052	1522	1979	2420	2385	2423	2397	2375
	Buyer	140	303	519	734	946	1161	1376	1554	1728	1869
	Seller	121	286	533	788	1033	1293	1493	1735	1916	2116
Bargaining Acts	Buyer+Seller	36	65	83	93	98	105	106	108	108	110
	Buyer	12	22	26	27	28	29	29	29	29	30
	Seller	14	23	29	32	33	33	33	33	33	33

Appendix D Hyperparameters

Features	n-gram	Inner/Outer k-Folds	Max Iterations	$\ell_{2}$ Coefficient
Numerical/BOW/LIWC	1	5	10k	$\{2^{x}\mid x\in\{-10,-9,\cdots,0,1\}\}$
Bargaining Acts	2	5	10k	$\{2^{x}\mid x\in\{-10,-9,\cdots,0,1\}\}$
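As an illustration of the search space in the table above, the sketch below enumerates the $\ell_{2}$ grid and lays out a five-fold inner/outer split. It is a dependency-free sketch under stated assumptions, not the training code used in the paper: the helper names `kfold` and `nested_splits` are hypothetical, folds are built by simple interleaving without shuffling, and how the $\ell_{2}$ coefficient maps onto a particular library's regularization parameter is not specified in the table.

```python
from typing import Iterator, List, Tuple

# Grid from the table: 2^x for x in {-10, -9, ..., 0, 1}.
L2_GRID: List[float] = [2.0 ** x for x in range(-10, 2)]
K_FOLDS = 5  # the table lists 5 for both the inner and outer loops

Split = Tuple[List[int], List[int]]


def kfold(indices: List[int], k: int) -> Iterator[Split]:
    """Yield (train, held_out) index lists for k interleaved folds."""
    for i in range(k):
        held_out = indices[i::k]  # every k-th item, starting at offset i
        held = set(held_out)
        train = [j for j in indices if j not in held]
        yield train, held_out


def nested_splits(n_items: int, k: int = K_FOLDS):
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    The inner splits partition only the outer training set, so the outer
    test fold is never seen while choosing a value from L2_GRID.
    """
    for outer_train, outer_test in kfold(list(range(n_items)), k):
        yield outer_train, outer_test, list(kfold(outer_train, k))


print(len(L2_GRID), L2_GRID[0], L2_GRID[-1])  # 12 0.0009765625 2.0
```

In a full pipeline, each value in `L2_GRID` would be scored on the inner splits and the best one refit on the outer training set before evaluating on the outer test fold.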

Model	Speaker Role	k-Folds	Max Epochs	Batch Size	Optimizer	Learning Rate
LED	Seller + Buyer	5	20	4	AdamW	5e-5

Attribute	Category	Percent (%)
Gender	Male	38.01
	Female	60.23
	Other	1.75
Age	18-24	63.74
	25-34	27.49
	35-44	4.09
	45-54	1.75
	55-64	2.34
	65-74	0.58
Hispanic		13.45
Race	American Indian or Alaska Native	2.15
	Asian	35.23
	Black or African American	14.29
	Native Hawaiian or Other Pacific Islander	0.20
	White	48.73
Education	< High School	0.00
	High School or GED	12.09
	Some college, no degree	36.26
	Associate degree	2.14
	Bachelor’s degree	31.19
	Master’s degree	15.20
	Doctorate or professional degree	3.12
Marital Status	Single (never married)	84.80
	Married or domestic partnership	13.45
	Divorced	1.75
Employment Status	Employed, full time (40+ hrs/wk)	20.86
	Employed, part time (up to 39 hrs/wk)	9.55
	Unemployed, looking for work	3.12
	Unemployed, not looking for work	0.39
	Student	65.30
	Homemaker	0.19
	Self-employed	0.58
Income	$0	14.15
	$1-$9,999	40.67
	$10,000-$24,999	14.54
	$25,000-$49,999	13.16
	$50,000-$74,999	10.22
	$75,000-$99,999	4.52
	$100,000-$149,999	1.38
	$150,000+	1.38
Risk Preferences	0 (unwilling to take risks)	0.00
	1	1.36
	2	7.41
	3	14.62
	4	15.40
	5	13.06
	6	15.79
	7	15.98
	8	9.55
	9	3.51
	10 (very willing to take risks)	3.31
No. of subjects w/ demographic info.		513
No. of subjects		521

Notes. This table reports select demographic attributes of study subjects. Attributes were collected from a survey of subjects prior to the start of each study session. Responses were voluntary. Participants were allowed to select multiple choices for Race. All other attribute questions allowed only a single choice response. Risk preferences were elicited from the question: “Are you generally a person who is willing to take risks or do you try to avoid taking risks?” Respondents rated themselves on an eleven-point scale from 0 (unwilling to take risks) to 10 (very willing to take risks). The percentage of respondents in each demographic category is reported, except for the last two rows, which give raw counts: the number of participants across all study sessions for whom we have demographic information, and the total number of experiment participants.
