Tag: Methodology

Field Notes from 2018’s Adventures in Applied Linguistics

Happy Birthday to us! We’ve been doing the bish thing for a year, so I guess we have to do that tired old practice of recapping because like Kylie, we had a big year.

TL;DR – following is a list of our plans for 2019 and a recap of what we learned in 2018.

This is a still from Kylie Jenner’s 2016 New Year Resolutions video. It shows her head and shoulders with the quote “like, realizing things…”

#goals

    1. We’re looking for guest writers. So if you know any other linguabishes, send them our way.
    2. We’re diversifying our content to include not just peer-reviewed articles in academic journals, but also conference papers, master’s theses, and whatever else strikes our fancies.
    3. We’re planning to provide more of our own ideas like in the Immigrant v. Migrant v. Expat series (posts 1, 2, and 3) and to synthesize multiple papers into little truth nuggets.
    4. Hopefully it won’t come up, but we’re not above dragging more racist garbage parading as linguistics again.

Plans aside, here’s all the stuff we learned. We covered a lot of topics in 2018, so it’s broken down by theme.

Raciolinguistics and Language Ideology

We wrote 5 posts on language ideology and raciolinguistics and we gave you a new word: The Native-speakarchy. Like the Patriarchy, the Native-speakarchy must be dismantled. Hence Dismantling the Native-Speakarchy Posts 1, 2, and 3. Since we had a bish move to Ethiopia, we learned a little about linguistic landscape and language contact in two of its regional capitals. Finally, two posts about language ideology in the US touch on linguistic discrimination. One was about the way people feel about Spanish in Arizona and the other was about Spanish-English bilingualism in the American job market. 

This is a gif of J-Lo from the Dinero music video. She’s wearing black lingerie and flipping meat on a barbecue in front of a mansion. She is singing “I just want the green, want the money, want the cash flow. Yo quiero, yo quiero dinero, ay.”

Pop Culture and Emoji

But we also had some fun. Four of our posts were about pop culture. We learned more about cultural appropriation and performance from a paper about Iggy Azalea and one about grime music. We also learned that J.K. Rowling’s portrayal of Hermione wasn’t as feminist as fans had long hoped. Finally, a paper about reading among drag queens taught us that there’s more to drag queen sass than just sick burns.

Emojis aren’t a language, but they are predictable. The number one thing this bish learned about emojis though is that the methodology used to analyze their use is super confusing.

This is a gif of the confused or thinking face emoji fading in and out of frame.

Lexicography and Corpora

We love a dictionary and we’ve got receipts. Not only did we write a whole three-post series comparing the usages of Expat v. Immigrant v. Migrant (posts 1, 2, and 3), but we also learned what’s up with short-term lexicography and made a little dictionary of words used by gay men in the 1800s.

Sundries

This is a grab bag of posts that couldn’t be jammed into one of our main categories. These are lone wolf posts that you only bring home to your parents to show them you don’t care what they think. These black sheep of the bish family wear their leather jackets in the summer and their sunglasses at night.

This is a black and white gif of Rihanna looking badass in shades and some kind of black fur stole.

Dank Memes

Finally, we learned that we make the dankest linguistics memes. I leave you with these.

 Thanks for reading and stay tuned for more in 2019!


Paper Drags: Do Linguistic Structures Affect Human Capital?

In case you’ve been off linguistics Twitter for the last week, you should know that it had a collective conniption last Wednesday. This is what happened.

A study was dropped (ya, academics drop papers) that claimed that in countries where the dominant language allowed pronouns to be omitted, education suffered.

There were a lot of hot takes with linguists sashaying into Twitter for an opportunity to drag this quote unquote study.

TL;DR: the study ignores current work in the field, doesn’t collaborate with any linguists, uses sloppy methods, and arrives at biased results.

Here are the problems with it as mined from Twitter:

Research:

This study by Horst Feldmann (2018) is not based on current research in linguistics. The “recent” research referenced in the introduction is a baloney economics study from 2013 by M. Keith Chen. It was dragged in its own time for its interpretation of the now infamous Theory of Linguistic Relativity.

Theory of Linguistic Relativity: This is a nearly century-old hypothesis claiming that an individual’s thoughts are constrained by the languages they speak. It is also known as Whorfianism.

A heck-ton of studies over the last 100 years have attempted to prove or disprove this theory. These days, linguists generally accept that language does or might have some effect on thought, but we’re not quite sure how large that effect is or might be. I’m not going to get into it here, but if you want to learn more, get reading!

Feldmann, like Chen before him, ran with what we call the strong version of the hypothesis. He boldly claims that “…language shapes speakers’ mental representation of reality…” which it doesn’t. If Feldmann had studied linguistics, he would have known that.

This leads us to the second major issue:

Author expertise:

@gretchenmcculloch compared this type of study to a linguist writing an economics paper. @sesquiotic pointed out that the study was not even co-authored by a linguist. He tweeted that the study has a “crib-toy use of linguistics” and that its chain of reasoning and supposition is patently problematic.

This is all a part of the invisibility of the linguistics field. @adamCSchembri pointed out that somehow, linguists aren’t considered experts by academicians in other fields.

But since Feldmann went ahead and decided to act the linguist anyway, let’s look at his premise:

The premise:

The premise of the paper is that there are languages that license the dropping of the pronoun before a verb. That’s true. A common example is Spanish, whose speakers can say “yo hablo” (I speak) but can also use just the “hablo” part if they want. Ok, that’s an incredibly oversimplified explanation, but that’s for another time.

What Feldmann got wrong was claiming that English does not license the dropping of the pronoun. Actually, speakers of English do it all the time. For example: “Do you speak English?” “Sure do!” or “Guess so.”

Yep, that’s pronoun drop. So the premise is wrong. This brings us to the bad linguistics of it all:

Bad linguistics:

@sesquiotic: the study doesn’t include actual linguistics and makes some pretty big claims about linguistics.

This paper is full of bad linguistics so here’s a list of a few that came up on Twitter:

  1. Misspelling hablo as ablo
  2. Studying 103 languages, but not mentioning which ones
  3. Mentioning spoken language, but not including any in their data
  4. Not defining or citing language variables used in regression tables
  5. Grouping together languages without acknowledging language families
  6. Using English example sentences that no one has ever uttered (I speak)
  7. Claiming that V-S-O languages are the most common, but not backing it up with evidence
  8. Referencing “ancient cultural values” and the “distant past” without defining what those things are or researching language history

@eviljoemcveigh: the linguistics is garbage so regression methods, covariates, and other statistical decisions are uninformed.

What do you get when you take an outdated hypothesis, add a false premise, and stir in some bad linguistics?

The conclusion:

Feldmann concludes that dropping a pronoun has a “negative effect on human capital” and that speakers of those languages have less education. Many people on Twitter were reminded of a similar conclusion by the Church of the Flying Spaghetti Monster in an open letter to the Kansas School Board.

The thing is, if you’re not putting in solid research and defined linguistic variables, the conclusion is moot. Feldmann’s conclusion is punching down at countries with less access to education and claiming that no one’s to blame because language. But there are guilty parties in the disparities in education around the world. A linguistics website isn’t the best place to learn about them, but this paper isn’t just bad linguistics, it’s bad anthropology, bad economics, bad statistics, bad research design, and bad critical thinking.

This bish’s conclusion? Sashay away, Feldmann!

Special thanks to Joe McVeigh (@Eviljoemcveigh), Lee Murray (@MurrayLeeA), Gretchen McCulloch (@GretchenAMcC), James Harbeck (@sesquiotic), and Nic Subtirelu (@linguisticpulse).

Recommended for no bishes!

————————————————————
Feldmann, Horst. “Do Linguistic Structures Affect Human Capital? The Case of Pronoun Drop.” Kyklos: International Review for Social Sciences, 8 Nov. 2018, doi:10.1111/kykl.12190.

Chen, M. Keith. “The Effect of Language on Economic Behavior: Evidence from Savings Rates, Health Behaviors, and Retirement Assets.” American Economic Review, vol. 103, no. 2, 2013, pp. 690–731.


Companion to “Are Emojis Predictable?”

Welcome to the companion to

Are Emojis Predictable?

by Francesco Barbieri, Miguel Ballesteros, and Horacio Saggion.

This is where I’ve attempted to provide some semblance of explanation for the methods of the study. Look, I tried my best with this, so don’t judge. I ordered it by the difficulty I had with each term instead of alphabetically. References at the end for thirsty bishes who just can’t get enough.

Difficulty | NLP Model or Term
😀 Sentiment Analysis

A way of determining and categorizing opinions and attitudes in a text using computational methods. Also opinion mining.
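
If you want to see the idea in action, here’s a minimal lexicon-based sketch in Python. The tiny word lists and the sentiment() function are made up for illustration; real systems use big curated lexicons or trained classifiers.

```python
# A hypothetical lexicon-based sentiment scorer; the word lists are invented.
POSITIVE = {"love", "great", "yas", "slay"}
NEGATIVE = {"hate", "awful", "garbage", "moot"}

def sentiment(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love this paper"))      # positive
print(sentiment("this study is garbage"))  # negative
```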

☺️ Neural Network

A computational model that’s loosely based on how neurons in the human brain work.
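
The smallest possible piece of one is a single artificial neuron: multiply the inputs by some weights, add them up, and squash the result. Here’s a sketch with arbitrary, untrained numbers:

```python
import numpy as np

# One artificial "neuron": a weighted sum of inputs squashed through a sigmoid.
# The weights and bias are arbitrary illustration values, not trained ones.
def neuron(inputs, weights, bias):
    z = np.dot(inputs, weights) + bias
    return 1 / (1 + np.exp(-z))   # sigmoid activation: output between 0 and 1

x = np.array([0.5, 0.1, 0.9])    # toy input features
w = np.array([0.4, -0.2, 0.7])   # toy weights
print(neuron(x, w, bias=0.1))
```

A real network is just lots of these stacked in layers, with the weights learned from data instead of made up.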

🙂 Recurrent Neural Network

A type of neural network that can be trained by algorithms and that stores information from earlier in a sequence to make context-based predictions. Also RNN.
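
A hand-wavy sketch of the recurrent part, with made-up sizes and random weights, just to show the hidden state being fed back in at each step:

```python
import numpy as np

# Bare-bones recurrent step: the hidden state from the previous word is fed back
# in alongside the current word, which is what lets the network keep context.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(4, 3))    # input -> hidden weights (illustrative sizes)
W_hh = rng.normal(size=(3, 3))    # hidden -> hidden weights (the recurrent part)

def rnn_step(x, h_prev):
    return np.tanh(x @ W_xh + h_prev @ W_hh)

h = np.zeros(3)
for x in rng.normal(size=(5, 4)): # pretend these are five word vectors
    h = rnn_step(x, h)            # the state carries information forward
print(h)
```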

🙂 Bag of Words

A model that basically counts up the number of instances of words in a text. It’s good at classifying texts by word frequencies, but because it determines words by the white space surrounding them and disregards grammar and word order, phrases lose their meaning. Also BoW.
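
Here’s how literal the “bag” is. This little sketch just splits on white space and counts, exactly as described:

```python
from collections import Counter

# Bag of words at its most literal: split on white space and count.
# Grammar and word order are thrown away entirely.
def bag_of_words(text):
    return Counter(text.lower().split())

print(bag_of_words("the bish read the paper"))
# Counter({'the': 2, 'bish': 1, 'read': 1, 'paper': 1})
```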

😐 Skip Gram

A neural network model that does the opposite of the BoW. Instead of looking at the whole context, the skip gram considers word pairs separately. It’s trying to predict the context from a word, so it weighs closer words more heavily than farther ones, and the order of words is actually relevant. One of the models behind Word2Vec.
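
A little sketch of how the skip gram’s training pairs get built from a sliding window. The window size of 2 is an arbitrary choice for illustration:

```python
# Skip-gram training data is just (target word, nearby word) pairs pulled from
# a sliding window around each word in the text.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

print(skipgram_pairs("yo quiero dinero ay".split()))
# [('yo', 'quiero'), ('yo', 'dinero'), ('quiero', 'yo'), ...]
```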

😐 Long Short-term Memory Network

A recurrent neural network that can learn the order of items in sequences and so can predict them. Also LSTM.
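
For the brave, here’s roughly what a word-level LSTM classifier can look like in Keras. This is a hypothetical sketch, not the paper’s actual model: the vocabulary size, the 64- and 128-unit layers, and the 20-emoji output are all placeholder numbers.

```python
import numpy as np
import tensorflow as tf

# Hypothetical word-level LSTM classifier; every size here is a placeholder.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),  # word vectors
    tf.keras.layers.LSTM(128),                                  # reads the tweet in order
    tf.keras.layers.Dense(20, activation="softmax"),            # pick one of 20 emojis
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

fake_tweets = np.random.randint(0, 10000, size=(2, 12))  # 2 tweets, 12 word ids each
print(model.predict(fake_tweets).shape)                  # (2, 20) emoji probabilities
```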

😑 Bidirectional Long Short-term Memory Network

The same as above, but it’s basically time travel because half the neurons are searching backwards and half are searching forwards even if more items are added later. Also BLSTM.
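
The time travel is basically a one-line change to the LSTM sketch above: wrap the layer so one copy reads the tweet forwards and another reads it backwards. Same placeholder sizes, still not the paper’s actual model:

```python
import tensorflow as tf

# Bidirectional version of the hypothetical LSTM classifier above.
blstm_model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),  # forwards + backwards
    tf.keras.layers.Dense(20, activation="softmax"),
])
```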

😓 Char-BLSTM

A character-based approach that learns representations for words that look similar, so it can handle alternatives of the same word type. More accurate than the word-based variety.
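
The character-based trick starts before any network gets involved: words are chopped into characters, so lookalike spellings end up sharing most of their input. A trivial sketch:

```python
# Character-based input: break the word into characters before embedding, so
# "dinero", "dineroo", and "dineroooo" end up looking related to the model.
def char_tokens(word):
    return list(word.lower())

print(char_tokens("Dineroooo"))
# ['d', 'i', 'n', 'e', 'r', 'o', 'o', 'o', 'o']
```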

😖 Word-BLSTM

Some kind of word-based variant of the above? Probably?

🤮 Word Vector

Ya, this one is umm… well, you see, it has magnitude and direction. And like, you have to pre-train it. So… “Fuel your lifestyle with .”
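
Ok, one more attempt: a word vector is literally just a list of numbers, and words that show up in similar contexts end up pointing in similar directions. The three toy vectors below are made up; real pre-trained ones have hundreds of dimensions.

```python
import numpy as np

# Made-up word vectors for illustration only.
vectors = {
    "dinero": np.array([0.9, 0.1, 0.3]),
    "money":  np.array([0.8, 0.2, 0.4]),
    "emoji":  np.array([0.1, 0.9, 0.2]),
}

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(vectors["dinero"], vectors["money"]))  # high: point the same way
print(cosine(vectors["dinero"], vectors["emoji"]))  # lower: less related
```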

Congratulations if you’ve made it this far! You probably already know more than me. Scream it out. I know I did 🙂
REFERENCES

Bag of Words (BoW) – Natural Language Processing, ongspxm.github.io/blog/2014/12/bag-of-words-natural-language-processing/.

Britz, Denny. “Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs.” WildML, 8 July 2016, www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/.

Brownlee, Jason. “A Gentle Introduction to Long Short-Term Memory Networks by the Experts.” Machine Learning Mastery, 19 July 2017, machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/.

Brownlee, Jason. “A Gentle Introduction to the Bag-of-Words Model.” Machine Learning Mastery, 21 Nov. 2017, machinelearningmastery.com/gentle-introduction-bag-words-model/.

Chablani, Manish. “Word2Vec (Skip-Gram Model): PART 1 – Intuition. – Towards Data Science.” Towards Data Science, Towards Data Science, 14 June 2017, towardsdatascience.com/word2vec-skip-gram-model-part-1-intuition-78614e4d6e0b.

Verwimp, et al. “Character-Word LSTM Language Models.” arXiv, Cornell University Library, 10 Apr. 2017, arxiv.org/abs/1704.02813.

Olah, Christopher. “Understanding LSTM Networks.” Colah’s Blog, colah.github.io/posts/2015-08-Understanding-LSTMs/.

Nielsen, Michael A. “Neural Networks and Deep Learning.” Determination Press, 2015, neuralnetworksanddeeplearning.com/chap1.html.

“Sentiment Analysis: Concept, Analysis and Applications.” Towards Data Science, Towards Data Science, 7 Jan. 2018, towardsdatascience.com/sentiment-analysis-concept-analysis-and-applications-6c94d6f58c17.

gk_. “Text Classification Using Neural Networks – Machine Learnings.” Machine Learnings, Machine Learnings, 26 Jan. 2017, machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6.

Thireou, T., and M. Reczko. “Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins.” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 4, no. 3, 2007, pp. 441–446., doi:10.1109/tcbb.2007.1015.

“Vector Representations of Words  | TensorFlow.” TensorFlow, www.tensorflow.org/tutorials/word2vec.

“Word2Vec Tutorial – The Skip-Gram Model.” Word2Vec Tutorial – The Skip-Gram Model · Chris McCormick, mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/.
