We now apply structural topic models to a real-world example: parliamentary debates.
Topic models are often used in two different ways: (1) to determine broad topics in general corpora, or (2) to determine ‘framing’, i.e. different ways of speaking about a topic or highlighting particular aspects of it, in thematically narrower corpora. We try to do the latter:
We first restrict the parliamentary debate corpus to make it a bit more homogeneous - we do this by using the agenda item variable and selecting only speeches about the European Union (Withdrawal) Bill.
We also discard the chair, since its role is mostly procedural, and (because we’ll use the party variable later on) all speakers for whom party is missing.
Instead of doing this (which again needs some computational power), you can also load the pre-processed data from the course webpage.
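library(dplyr)

# Load the full House of Commons corpus and keep only the Withdrawal Bill debates
Corp_HouseOfCommons_V2 <- readRDS("../data/Corp_HouseOfCommons_V2.rds")
brexit_bill <- Corp_HouseOfCommons_V2 %>%
  filter(agenda == "European Union (Withdrawal) Bill") %>%
  filter(chair == FALSE) %>%   # drop speeches by the chair
  filter(!is.na(party))        # drop speakers without a party
save(brexit_bill, file = "../data/brexit.RData")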
Now, we create a corpus and do some pre-processing.
Generally, we can apply similar pre-processing steps as with all other models. Some studies have discouraged the use of extensive pre-processing for topic models, but there is no final word on this.
The reason I still do some pre-processing is pragmatic: since topic models look for clusters of words, I recommend removing very rare terms - e.g. words that occur only a single time. This primarily helps to reduce computing time. The same of course goes for punctuation.
Additionally, removing overly frequent terms and stopwords will improve the interpretability of the topic labels - so we also do that here. We also stem the words here, although opinions on stemming differ.
library(quanteda)

load("../data/brexit.RData")
brexit_corpus <- corpus(brexit_bill)
# Tokenise first: passing a corpus and tokens() arguments directly to dfm() is deprecated
brexit_dfm <- brexit_corpus %>%
  tokens(remove_punct = TRUE, remove_numbers = TRUE) %>%
  dfm() %>%
  dfm_remove(stopwords("en")) %>%
  dfm_wordstem("en") %>%
  dfm_trim(min_docfreq = 0.001, max_docfreq = 0.8, docfreq_type = "prop", verbose = TRUE)
## Removing features occurring:
## - in fewer than 4.437 documents: 10,185
## Total features removed: 10,185 (74.5%).
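To make the proportional thresholds more concrete, here is a minimal base-R sketch of what trimming by document share amounts to. The count matrix, the feature names and the thresholds of 0.3 and 0.9 are made up for illustration. The figure of 4,437 documents is inferred from the message above (0.001 times the number of documents gives 4.437), not taken from the data.

```r
# Toy document-feature counts: 5 documents (rows) x 4 features (columns).
# All values are made up for illustration.
counts <- matrix(
  c(1, 0, 2, 1,
    1, 0, 1, 1,
    3, 0, 0, 1,
    1, 1, 0, 1,
    2, 0, 0, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("doc", 1:5), c("brexit", "zombi", "border", "hon"))
)

# Share of documents in which each feature occurs at least once
doc_share <- colMeans(counts > 0)

# Keep features that occur in at least 30% and at most 90% of documents
# (toy thresholds, not the ones used in the exercise)
keep_feature <- doc_share >= 0.3 & doc_share <= 0.9
trimmed <- counts[, keep_feature, drop = FALSE]
colnames(trimmed)

# A proportional threshold translates into a document count like this
# (4437 documents is inferred from the trimming message, see above)
min_docs <- 0.001 * 4437
min_docs
```

In the toy data, the feature that appears in every document and the one that appears in only one document are both dropped, which mirrors the logic of removing overly frequent and very rare terms.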
In the past, you had to convert a dfm into stm input using quanteda’s convert() function. Now, stm can also directly work with dfms as input.
Sometimes, it is still preferable to convert a dfm into stm input explicitly so you know what happens - for example, stm removes documents with no features, and you may not be able to use some of the evaluation functions if the number of documents differs between the original texts and your dfm.
However, you can also remove empty documents from your dfm yourself to deal with this issue. The following two lines of code do this; they also store the selection in keep, which we need later to match the original texts to the documents in the model.
keep <- ntoken(brexit_dfm) > 0
brexit_dfm <- dfm_subset(brexit_dfm,keep)
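The role of keep is easiest to see on a toy example. The following base-R sketch uses made-up texts and hypothetical token counts (in the exercise these come from ntoken(brexit_dfm)); it only illustrates why the same logical vector is later used to subset the original texts.

```r
# Toy data: four speeches, one of which loses all its tokens in pre-processing
texts <- c("the border in ireland", "!!!", "single market access", "vote on the bill")
tokens_per_doc <- c(2, 0, 3, 2)  # hypothetical token counts after pre-processing

# Logical vector marking the non-empty documents
keep <- tokens_per_doc > 0

# Apply the same selection to the original texts so that both stay aligned
texts_kept <- texts[keep]

length(texts)       # number of original texts
length(texts_kept)  # number of documents the model would actually see
```

Without this step, the i-th document in the model would no longer correspond to the i-th original text once an empty document has been dropped.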
Now, run the actual model on this - you can choose the number of topics freely.
library(stm)
stm_1 <- stm(brexit_dfm, K = 20, seed = 2020, verbose = FALSE)
save(stm_1,brexit_bill,keep,brexit_dfm,file="../data/stm_brexit1.RData")
If your model converges too slowly, load stm_brexit1.RData to continue.
Now, have a look at the topics using labelTopics(). Do the topics make sense to you? Do you find the descriptors coherent? Which type of descriptor (Highest Prob, FREX, Lift or Score) do you find easiest to make sense of?
Also, pick a topic you find interesting and keep the number in mind.
labelTopics(stm_1)
## Topic 1 Top Words:
## Highest Prob: agreement, deal, govern, negoti, minist, withdraw, transit
## FREX: transit, period, date, withdraw, implement, negoti, articl
## Lift: straitjacket, dissent, florenc, assent, two-year, earliest, transit
## Score: transit, agreement, dissent, period, negoti, withdraw, florenc
## Topic 2 Top Words:
## Highest Prob: ireland, northern, agreement, border, good, friday, hard
## FREX: friday, ireland, irish, northern, belfast, republ, border
## Lift: british-irish, dodd, ira, irish, republ, scare, sinn
## Score: ireland, northern, border, friday, belfast, republ, irish
## Topic 3 Top Words:
## Highest Prob: right, protect, work, children, uk, worker, equal
## FREX: children, child, famili, young, women, worker, employ
## Lift: child, women, asylum, asylum-seek, aunt, children, expens
## Score: children, expens, worker, child, famili, refuge, women
## Topic 4 Top Words:
## Highest Prob: power, claus, bill, minist, amend, act, use
## FREX: defici, power, statut, henri, correct, limit, viii
## Lift: bought, restraint, circumscrib, exhaust, synonym, defici, œnecessaryâ
## Score: power, claus, defici, henri, statut, viii, deleg
## Topic 5 Top Words:
## Highest Prob: union, market, trade, singl, custom, deal, european
## FREX: custom, singl, eea, market, stay, money, billion
## Lift: efta, frequent, protectionist, purchas, surplus, trader, inferior
## Score: custom, market, trade, eea, singl, £, frequent
## Topic 6 Top Words:
## Highest Prob: environment, govern, principl, environ, secretari, polici, state
## FREX: environment, indic, environ, air, bodi, enforc, water
## Lift: œpollut, pavilion, paysâ, zombi, 25-year, defra, greener
## Score: indic, environment, environ, enforc, bodi, watchdog, pollut
## Topic 7 Top Words:
## Highest Prob: uk, need, framework, work, administr, govern, devolv
## FREX: framework, administr, common, meet, joint, jmc, engag
## Lift: justifi, communiquã, multilater, jmc, inter-parliamentari, darlington, timescal
## Score: justifi, devolv, framework, administr, jmc, uk, joint
## Topic 8 Top Words:
## Highest Prob: hous, bill, vote, govern, time, debat, second
## FREX: meaning, motion, hour, second, vote, read, hous
## Lift: maastricht, churchil, eight, wave, king, gainsborough, leigh
## Score: eight, vote, motion, meaning, bill, hous, read
## Topic 9 Top Words:
## Highest Prob: scottish, govern, scotland, power, devolv, devolut, uk
## FREX: wale, scotland, scottish, devolut, welsh, snp, settlement
## Lift: withhold, alan, carwyn, dewar, fabric, lcm, murray
## Score: scottish, scotland, devolv, devolut, welsh, wale, nonsens
## Topic 10 Top Words:
## Highest Prob: right, hon, friend, learn, member, amend, issu
## FREX: learn, beaconsfield, griev, rushcliff, friend, sir, clark
## Lift: halfway, hammond, wimbledon, griev, unwis, beaconsfield, rushcliff
## Score: learn, friend, beaconsfield, hon, right, griev, amend
## Topic 11 Top Words:
## Highest Prob: right, court, charter, law, case, principl, protect
## FREX: charter, human, court, francovich, fundament, case, incorpor
## Lift: adjud, beano, o'brien, overridden, tobacco, charter, common-law
## Score: charter, court, moral, human, law, data, right
## Topic 12 Top Words:
## Highest Prob: €, â, say, minist, said, word, œthe
## FREX: €, â, œthe, œwe, thatâ, mine, said:â
## Lift: marr, mine, œbut, peopleâ, œi, œno, journalist
## Score: €, â, mine, œthe, œappropriateâ, œwe, œi
## Topic 13 Top Words:
## Highest Prob: eu, uk, regul, standard, industri, sector, anim
## FREX: industri, anim, sector, welfar, six, chemic, export
## Lift: databas, electr, just-in-tim, medicin, motor, six, evalu
## Score: six, anim, industri, chemic, sector, welfar, export
## Topic 14 Top Words:
## Highest Prob: peopl, vote, leav, brexit, referendum, constitu, countri
## FREX: referendum, democraci, campaign, peopl, constitu, elect, voter
## Lift: alland, blue, fraser, heal, remoan, talent, regain
## Score: referendum, vote, peopl, blue, immigr, democraci, constitu
## Topic 15 Top Words:
## Highest Prob: eu, law, claus, legisl, retain, exit, uk
## FREX: retain, domest, law, eu, exit, oblig, uncertainti
## Lift: god, sweeper, pre-exit, law-i, post-exit, eu-deriv, convert
## Score: law, eu, domest, retain, god, exit, claus
## Topic 16 Top Words:
## Highest Prob: legisl, committe, hous, procedur, scrutini, chang, statutori
## FREX: scrutini, procedur, statutori, instrument, secondari, primari, committe
## Lift: negat, si, sift, sis, broxbourn, affirm, blanch
## Score: negat, instrument, procedur, scrutini, sift, statutori, legisl
## Topic 17 Top Words:
## Highest Prob: european, parliament, decis, sovereignti, lord, british, constitut
## FREX: sovereignti, judg, delight, judiciari, decis, sovereign, adequaci
## Lift: argyl, hale, delight, digit, primaci, v, baro
## Score: delight, judiciari, adequaci, judg, data, sovereignti, court
## Topic 18 Top Words:
## Highest Prob: hon, way, give, point, gentleman, friend, make
## FREX: give, gentleman, way, point, hon, agre, ladi
## Lift: give, gentleman, oh, ah, interrupt, intervent, pool
## Score: give, gentleman, hon, way, friend, ladi, agre
## Topic 19 Top Words:
## Highest Prob: think, govern, thing, go, much, question, might
## FREX: think, quit, thing, tri, answer, worth, bit
## Lift: worth, googl, hidden, humbl, rees-mogg, apolog, somerset
## Score: worth, assess, think, humbl, probabl, answer, thing
## Topic 20 Top Words:
## Highest Prob: member, amend, claus, new, support, tabl, hon
## FREX: support, east, name, rise, tabl, speak, new
## Lift: rise, lesli, nottingham, inclin, fife, gethin, east
## Score: rise, amend, claus, new, support, member, tabl
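To see why the descriptors can differ so much, it may help to compare two of them on a toy example. The sketch below is a simplification under stated assumptions: all probabilities and word frequencies are made up, "Highest Prob" is taken as the ranking by within-topic probability, and Lift is approximated as the within-topic probability divided by the word's overall relative frequency in the corpus. FREX and Score are not covered here.

```r
# Toy topic-word probabilities (rows: topics, columns: words); made-up numbers
beta <- matrix(
  c(0.50, 0.30, 0.15, 0.05,
    0.50, 0.10, 0.01, 0.39),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("topic1", "topic2"), c("govern", "border", "irish", "market"))
)

# Toy overall relative word frequencies in the corpus (sum to 1)
corpus_freq <- c(govern = 0.50, border = 0.20, irish = 0.08, market = 0.22)

# "Highest Prob" style ranking: most probable words within topic 1
by_prob <- names(sort(beta["topic1", ], decreasing = TRUE))
by_prob

# Lift-style ranking: topic probability relative to overall frequency
lift <- sweep(beta, 2, corpus_freq, "/")
by_lift <- names(sort(lift["topic1", ], decreasing = TRUE))
by_lift
```

In this toy case, a generic word that is frequent everywhere leads the probability ranking, while the lift-style ranking favours a rarer word that is concentrated in the topic. This is consistent with the output above, where Lift often surfaces rare and sometimes odd terms.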
Plot the distribution of topics. Is your topic frequent in the corpus? If you find the plot difficult to understand, you can distribute the topics over several plots (by specifying topics=1:n) or adjust the canvas.
par(bty="n",col="grey40",lwd=5)
plot(stm_1,topics=1:10)
plot(stm_1,topics=11:20)
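The summary plot shows the expected proportion of each topic across the corpus, which is the column mean of the document-topic matrix. Here is a minimal sketch with a made-up matrix; in a fitted model, the corresponding matrix is stored in stm_1$theta.

```r
# Toy document-topic proportions (rows: documents, columns: topics).
# Each row sums to 1; all values are made up for illustration.
theta <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.2, 0.6, 0.2,
    0.1, 0.7, 0.2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), paste0("topic", 1:3))
)

# Expected topic proportions: average share of each topic over all documents
expected_prop <- colMeans(theta)

# Topics ordered from most to least prevalent
sort(expected_prop, decreasing = TRUE)
```

Applied to the real model, sort(colMeans(stm_1$theta), decreasing = TRUE) should give the numbers behind the plot, which can be easier to read than a crowded canvas.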
Now, use findThoughts() to read some of the most characteristic documents for that topic. Do you feel they fit the topic?
You can increase the number of documents by setting a higher n.
findThoughts(stm_1,
brexit_bill$text[keep],topics=14,n=15)
##
## Topic 14:
## Will the hon. Gentleman hear me out? The supremacy of Parliament is a proud tradition that all of us should defend. I find it perplexing that, for example, the hon. Members for North East Somerset (Mr Rees-Mogg) and for Blackley and Broughton (Graham Stringer), both of whom I know to be thinking people, are so eager to see us leave the EU that they forget everything else in its path. Democracy matters, and whoever tries to suspend democracy to enact the will of the people should think again. The will of the people is of course a pretty mixed bag and is not fixed forever. On 23 June last year, almost 70% of my constituents voted to remain in the EU. In June this year, I was elected on the basis of my opposition to the Government's Brexit line. That was the will of the people in my constituency at that point. True to form, Bath had one of the highest voter turnouts, and active engagement in Bath is not limited to election time; it is evident every day. Protest groups, demonstrations and lively debates are testament to how much people in Bath care about how our country is run. Another principle of democracy that they want to see practised is that I can speak on their behalf about their concerns about when and how we leave the EU without being labelled as a remoaner, a reverser, unpatriotic or undemocratic. Democracy is about the right to debate freely and voice an opinion without being labelled or bullied. If we truly want to achieve the best for our country, we need to be able to discuss all outcomes freely, including that people-leavers or remainers-can change their mind. The Bill adds another level of madness to the Brexit process, betraying not only those who voted remain, but those who chose to leave. One of the leave campaign's strongest arguments was about taking back control here in Westminster, but instead of giving control back to this Parliament, which the leave campaign championed, the Bill is a power grab by Ministers. 
One of my constituents said to me:“When people voted to leave the European Union, they didn't vote to swap backroom deals in Brussels for more of the same in Whitehall. They voted for Parliament and the British people to have more of a say.â€As the MP for Bath, I will fight this attack on our democracy. I will not sit idly by as this Government try to erode our rights and change our laws behind closed doors. How can anybody support this Bill? My only conclusion is that those who support it want their version of Brexit at any cost, including democracy. Come on, let us stand up for democracy and stop this flawed Bill in its tracks. I dare say that the will of the people will be right behind us.
## I happily add to the record. It makes some people's circumstances more difficult, but I said that generally speaking, the larger the Labour majority in the general election and the bigger the turnout in the last general election, the one before that and the one before that, the more likely constituents were to vote leave.
## Our choice tonight is clear. Do we deliver the wish of the electorate or the whim of the unelected? My constituents were very clear in the referendum: 70% voted to leave, and all the constituencies in the Potteries voted to leave. Those people want to hear all the Potteries MPs speak up for their decision, to accept their wisdom and to champion the Brexit that they want to see, and it is disappointing that not all of them have done so at every stage of the Bill. If there is one message that the referendum sent us, it is surely this: that the traditional working-class communities across the United Kingdom will no longer be ignored. The key reason they voted for me and got rid of my Labour predecessor was to ensure that we delivered on Brexit. We must fulfil that promise and reject amendments tabled in the other place. The people of Stoke-on-Trent want Brexit to refresh the parts of Britain that the EU did not effectively reach, and they want a closer policy focus on how local and regional Britain can benefit from a global trading future. That will be possible only if we leave the customs union, which will allow us to pursue our own independent trade policies, making and enhancing our trade links with countries throughout the world. It will cause a crisis of democracy if we fail to deliver the result that people voted for, to get the best out of Brexit from new trade around the world and to reject the Lords amendments. It is also critical that we leave the EEA and regain control of our borders. Immigration and ending the free movement of people was a primary reason for people in Stoke-on-Trent voting to leave the EU. They want us to put in place an effective, fair immigration system that will ensure the number of people coming here is at a manageable level that does not put undue pressures on local services, and that those coming here make a meaningful contribution to our country. 
It is essential that the House rejects amendments that would keep us saddled to the EEA and the continuation of free movement without any control or say. Nothing will lead the electorate to hold Parliament in contempt quite like Parliament holding the electorate in contempt, but that is precisely what the House of Lords is asking us to do. Instead of delivering for the House of Peers, we should be positive about delivering the people's choice. We must embrace the opportunities that come from taking back control, and, most of all, we must get on with it. The people have given us an instruction to leave the European Union. We must stop those trying to frustrate and sabotage Brexit. This House must obey the British people, and so must the House of Lords.
## I rise to speak to amendment 120. Since I arrived in this place in June and started taking part in the Brexit debate, one thing has intrigued me: have the Prime Minister and many other remain MPs changed their minds? We all know that the Prime Minister supported remaining in June 2016. Has she changed her mind since? This is important because she and her Government use one big argument for pressing on with Brexit: it is the will of the people. Is it? For the Government and the hard Brexiteers, the referendum result is fixed forever. The people cannot change their minds. The Prime Minister and other MPs can change their minds, but the people cannot. As the months go by and the Government's legitimacy for implementing their version of Brexit becomes less and less legitimate, obeying the will of the people becomes the last remaining legitimacy, but nobody bothers to find out what the will of the people is now. Indeed, the last to be asked are the people themselves. Hon. Members are right to say that Britain is a parliamentary democracy, but now we have had a referendum, there is no obvious mechanism for updating, confirming or reviewing the referendum result. The 2017 general election provided no mandate for overturning the referendum result. It is obvious that 650 MPs cannot update, confirm or review the decision taken by 33 million people, but the people themselves can, and the people themselves should be allowed to change their minds, in either direction. There are people now who voted remain who feel that the decision has been taken and the Government should get on with it. There are others who voted leave who fear that they will be let down by politicians who have used them for their own ends. The will of the people is a mixed bag. The Government are legislating for a Brexit in the name of the people. Their problem is that they might find themselves pressing ahead without the people's consent. Last week, Parliament voted to give itself a vote on the deal. 
This was a welcome step forward, but what started with the people must end with the people. The people must sign off or reject the deal. Only the people can finish what the people have started.
## I would not call the right hon. Gentleman a remoaner, but he is a Liberal Democrat; I am just wondering which bit of the democrat in him does not accept the result of the referendum, that 52% of the country voted to leave and that the Prime Minister made it absolutely clear that we would leave if that is what the people voted for. Let me remind him that 41% of his constituents voted for him, whereas 52% voted to leave the European Union. When is he going to ask for a rerun in his own seat?
## It was a pleasure to listen to the thoughtful and considered speech of the right hon. Member for Don Valley (Caroline Flint). She made some sensible points about immigration, on which I will focus in my remarks. Many Members have spoken in favour of joining the EEA but, as I said briefly at Prime Minister's questions, immigration was one of the most important issues that decided the referendum result, so we need to take that into account. Like the right hon. Lady and my hon. Friend the Member for Brigg and Goole (Andrew Percy), I want immigration, but I want immigration to be controlled by Parliament. I want us to decide that we want people with the skills and talents that will make a contribution and increase this country's wealth, and they will be welcomed as a result. Immigrants themselves often want a properly controlled immigration system, because they know that they will be welcomed, they will be supported and they will not be scapegoated, as happens when we lose control of the system. The voters told us that they do not want a system in which we have no control, or very little control, over who comes to our country.
## I am afraid that the public are not fooled by the motives of people who clearly want to delay, frustrate or overturn the result of the referendum. It is a shame some of them cannot admit it. The shadow Secretary of State said that people had said over a long period of time that if we did this or that, Brexit will be frustrated. May I just suggest to him that he gets out of London, because people around the country feel that Brexit is being frustrated? It is already being frustrated a great deal by this House. So he has this idea that Brexit has not been frustrated, but he needs to get-
## I did not hear what the hon. Lady said, but I am sure that Hansard did, so I will move swiftly on. I say to those on the Treasury Bench, and anybody else who might be listening to this speech, that the profound difference between those people and people like me-right hon. and hon. Members on both sides of the House, right across these green Benches-is that we have accepted the result, although it may break our hearts to do so. That is quite a dramatic statement, but many people are genuinely upset that we are going to leave the European Union. Nevertheless, they have accepted the result even though it goes against everything that they have ever believed in. They have not only accepted the result, but then voted to trigger article 50. One of the things that saddens me as much as it saddens me that we are going leave the European Union-probably more so-is the inability of the people who supported and voted for the leave campaign to understand and respect those of us who were remainers, who voted to trigger article 50, and now genuinely say that we are here to help deliver this result to get the best deal that we can as a country, putting our country before our own views and before our party political allegiances. It may be that some leavers, especially some people in Government or formerly in Government, cannot accept that because unfortunately-I am going to have to say this-they judge people like me by their own standards. For people to say that by tabling an amendment one is somehow trying to thwart or stop Brexit is, frankly, gravely offensive. That level of insult-because it is an insult-has got to stop. People have to accept that there is a genuine desire certainly among people on the Government Benches, and on the Opposition Benches, to try to come together to heal the divide and get the best deal for our country.
## Yes, it did. A 600-page White Paper was also produced a year or so before the referendum, which allowed everyone taking part to be a lot better informed than even the same Scots voters were about the EU referendum. It is also worth reminding ourselves that after what has been described as a disastrous and divisive referendum, the first thing that happened in Scotland was that campaigners from all sides got together in local churches, held services of reconciliation and committed ourselves to working together to make the result work, even if it was not the result that we wanted. In the immediate aftermath of the EU referendum, there was a massive increase in crimes of racial hatred against citizens in this country and elsewhere. That was not the fault of those who voted to leave, but a consequence of how the referendum had been set out and how, for too many people, the campaign was conducted.
## Thank you for allowing me to say a few words in this setting, Mr Speaker. I wish to make it clear that, despite whatever else I may say in this speech, I support this Bill wholeheartedly and I wish it to be a success. Uppermost in my mind when considering the Bill are the ramifications of there not being a Bill. I think about the choice the British people made to leave the EU and I respect it. We made a commitment to act on that instruction and act on it we shall-we will honour that vote. Those who choose to disregard the vote of the British people must answer to the British people. My constituency voted to remain in the EU, but I know that my constituents are democrats who expect me, as their elected Member of Parliament, to ensure that their best interests are served in the light of the outcome and that the result is upheld. Many businesses and individuals in my Stirling constituency are ready to make the best of Brexit.
## We had a referendum on whether or not Britain should leave the EU. That referendum has taken place; that decision has taken place; and Parliament has respected that decision. Despite how individual Members might have voted in that referendum, or on which side we might have campaigned, as a whole Parliament has respected that referendum result. The referendum did not decide how we leave the EU, however, or what the Brexit deal or transitional agreement should be. That is the responsibility now for the Government in negotiations, but also for this Parliament. I point out to Members who claim that somehow we cannot have a parliamentary debate on this because it is an internationally negotiated deal-because, somehow, it is a done deal-that Parliament must be able to have a say in this process and we should trust Parliament to be mature and responsible. A lot of Conservative Members said that if we let Parliament vote on article 50, the sky would fall in because it would somehow stop the Brexit process, rip up the referendum result and get in the way of democracy. But actually, the Members of this Parliament know that we have a responsibility towards democracy. We have a mature responsibility to our constituents to defend the very principles of democracy. That is exactly why many of us, including me, voted for article 50, to respect the referendum result, but we do not believe that we should then concentrate powers in the hands of Ministers to enable them do whatever they like. We have a responsibility to defend democracy and those democratic principles. It is our responsibility as Members of Parliament to have our say and to ensure that we get the best deal for the country, rather than just give our power to Ministers.
## I have cut down my speech, because it is almost the witching hour and the Brexit Minister needs to weave his magic. I represent the town of Hartlepool and the outlying villages. I have about 96,000 constituents, and in the EU referendum, of those who voted, more than 70% voted to leave-the highest percentage in the north-east. Clearly, the vast majority of people in my constituency want Brexit. It is my duty, as their MP, to reflect that opinion, but I believe it would be a dereliction of that duty if I voted to give Ministers executive powers to implement changes to complex and important regulations without recourse to scrutiny by Parliament. Despite all the rhetoric and spin, I do not see voting against this power-grab Bill as blocking Brexit-far from it. As a former union official, I know that if you allow the other side to have it all their way in negotiations you may as well not be in the room. That would not be acting in members' best interests. I believe I am acting in my constituents' best interests by voting to protect the right to hold the Government to account during the Brexit process. To do otherwise would be unacceptable and disrespectful to my constituents.
## I very much agree with the hon. Gentleman. All the things I was talking about can be implemented now to better manage migration while we are part of the EEA, and I support them, but what are the real underlying causes of concern here? Not enough decent affordable housing; a shortage of school places; an NHS in crisis; and not enough well-paid and decent jobs. Let us not pretend that all these problems will disappear or be mitigated if we cease participating in the EEA. As hon. Members have said, they will get worse, because there will be less revenue going to the Exchequer to pay for those things. Those underlying problems are no more the fault of European immigrants now than they were the fault of the Commonwealth citizens who came here in the 1960s and 1970s. Let us make no mistake: people in traditional Labour voting areas were saying exactly the same things about the Windrush generation, about south Asian immigration, and about the likes of my father from west Africa being the cause of our problems way back then, as they do now in respect of EU citizens. Curbing Commonwealth immigration then and ending EU free movement now did not and will not solve these problems, and we know it. That is why Labour Governments have always addressed those problems by properly funding the NHS, by having a national minimum wage, by investing in our schools and so on. That is why I will vote for the amendment tabled by my party's Front-Bench team, and also for Lords amendment 2. A colleague came up to me in the Tea Room yesterday. She represents a seat in the north-west and, to my surprise, she told me that she would also be voting for the Lords EEA amendment. I asked her how come she was doing that. 
Despite the issues and the challenges that I know that she and many of my colleagues have to deal with in respect of that issue, which I do not have to deal with in my own constituency, she said, “Yes, there are big concerns about immigration, certainly compared with your area, Chuka, but the bottom line is that we have nothing like the amount of immigration from the EU or from outside the EU as you do in your constituency. I know that the cause of our problems is not that immigration, so I will not go around saying that I agree with any claim that that is the case, because I know what that will do. It won't help us deal with any of these problems, but what it will do is deprive people of jobs.†That is why I say to my Labour colleagues that we should not ignore this issue of immigration, but let us deal with the problems and underlying causes in a Labour way. That is what our history dictates.
## I am aware of the European Parliament, which cannot reject the legislation that is imposed on it by unelected commissioners. This is about re-establishing democracy. The EU has nothing to do with democracy-it is a deficit in democracy. We are taking that back. Opposition Members should celebrate that fact.
## When that was said-it probably was said by one or two campaigners on the remain side during the referendum campaign-it was used as an argument against voting to leave. The reaction of leave campaigners was to dismiss it, saying it was the politics of fear, that people were being alarmist in talking about leaving the single market and that in fact our trading arrangements would remain absolutely unchanged, because the Germans had to sell us their Mercedes. That was the role it played in the referendum campaign.
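Conceptually, the documents shown above are those in which the chosen topic has the highest estimated share. A minimal base-R sketch of that selection, with made-up texts and topic shares (the names texts, theta_topic and top_n are hypothetical):

```r
# Toy texts and the made-up share of one topic in each of them
texts <- c("speech A", "speech B", "speech C", "speech D")
theta_topic <- c(0.10, 0.65, 0.05, 0.40)

# Number of documents to return (the role of n in findThoughts())
top_n <- 2

# Indices of the documents with the highest share of the topic
top_docs <- order(theta_topic, decreasing = TRUE)[seq_len(top_n)]
texts[top_docs]
```

This also shows why the texts passed to findThoughts() must line up with the documents in the model: the selection is made by position.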
The stm model allows you to incorporate meta information. This meta information can influence how prevalent a topic is in a document (prevalence covariate) or which words are used to discuss a topic (content covariate).
Pick a metadata variable from the data set that you think has an impact on the content of the speeches and use it as a prevalence covariate for a model.
If your model converges too slowly, load stm_brexit2.RData to continue.
stm_2 <- stm(brexit_dfm,K=10,prevalence=~party,seed=2020,verbose=F)
save(stm_2,brexit_bill,keep,brexit_dfm,file="../data/stm_brexit2.RData")
In a second step, look at the topics and use estimateEffect() to estimate the effect of your prevalence covariate on a topic where you would think it may have an effect.
labelTopics(stm_2)
## Topic 1 Top Words:
## Highest Prob: want, go, deal, get, make, thing, negoti
## FREX: get, thing, go, much, want, best, deal
## Lift: aunt, conting, humanitarian, refuge, dissent, perpetu, limbo
## Score: get, deal, negoti, thing, want, go, process
## Topic 2 Top Words:
## Highest Prob: union, trade, european, market, singl, custom, eu
## FREX: trade, custom, economi, eea, £, industri, billion
## Lift: forecast, livelihood, alland, american, auster, automot, bus
## Score: custom, trade, market, eea, £, singl, border
## Topic 3 Top Words:
## Highest Prob: peopl, brexit, leav, referendum, mani, countri, constitu
## FREX: referendum, democraci, peopl, british, campaign, june, elect
## Lift: accuraci, cancer, dynam, enemi, guardian, hidden, insecur
## Score: peopl, referendum, british, countri, democraci, immigr, anguilla
## Topic 4 Top Words:
## Highest Prob: bill, power, legisl, committe, claus, withdraw, parliament
## FREX: procedur, scrutini, instrument, statutori, secondari, deleg, withdraw
## Lift: array, œfor, rosi, steadfast, affirm, broxbourn, deleg
## Score: power, legisl, bill, instrument, procedur, statutori, secondari
## Topic 5 Top Words:
## Highest Prob: right, law, protect, court, principl, eu, charter
## FREX: court, charter, human, data, suprem, ecj, justic
## Lift: adjud, beano, degrad, digniti, disappl, expressli, factortam
## Score: charter, law, court, right, human, data, protect
## Topic 6 Top Words:
## Highest Prob: scottish, govern, devolv, scotland, uk, ireland, power
## FREX: scottish, devolv, scotland, northern, administr, devolut, welsh
## Lift: administr, agreement-th, alan, arfon, assembl, bbc, british-irish
## Score: scottish, devolv, scotland, devolut, welsh, ireland, northern
## Topic 7 Top Words:
## Highest Prob: amend, govern, new, claus, state, minist, tabl
## FREX: secretari, environ, amend, tabl, environment, new, report
## Lift: 25-year, gimmick, government-i, lieu, proven, straitjacket, zombi
## Score: amend, environment, govern, tabl, claus, environ, justifi
## Topic 8 Top Words:
## Highest Prob: hon, member, right, friend, way, give, point
## FREX: learn, beaconsfield, friend, griev, hon, gentleman, mr
## Lift: assidu, hammond, that-a, wellingborough, benn, hilari, leed
## Score: hon, friend, right, learn, gentleman, eight, beaconsfield
## Topic 9 Top Words:
## Highest Prob: eu, exit, claus, ensur, uk, day, continu
## FREX: exit, ensur, schedul, certainti, regul, function, requir
## Lift: durat, indetermin, maximis, cut-off, entiti, fee, nonsens
## Score: exit, claus, nonsens, eu, uk, schedul, regul
## Topic 10 Top Words:
## Highest Prob: €, vote, hous, govern, say, â, minist
## FREX: meaning, vote, €, â, tri, thought, someth
## Lift: accompli, conceiv, fait, guilti, halfway, hobson, œbut
## Score: €, vote, hous, â, govern, meaning, lord
est<-estimateEffect(1:10~party,stm_2,docvars(brexit_dfm))
You can see a model summary using summary(). If you want to see the documentation for the summary command, check ?summary.estimateEffect:
summary(est,topics=2:3)
##
## Call:
## estimateEffect(formula = 1:10 ~ party, stmobj = stm_2, metadata = docvars(brexit_dfm))
##
##
## Topic 2:
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)      0.055001   0.003345  16.440  < 2e-16 ***
## partyDUP         0.015895   0.023315   0.682  0.49545
## partyGPEW        0.035928   0.026620   1.350  0.17718
## partyIndependent 0.002418   0.018302   0.132  0.89490
## partyLab         0.069135   0.005451  12.682  < 2e-16 ***
## partyLibDem      0.061499   0.014070   4.371 1.27e-05 ***
## partyPlaidCymru  0.013407   0.025657   0.523  0.60133
## partySNP         0.026690   0.008637   3.090  0.00201 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Topic 3:
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)       0.072426   0.002453  29.527  < 2e-16 ***
## partyDUP          0.064616   0.018963   3.408 0.000661 ***
## partyGPEW         0.025494   0.018417   1.384 0.166347
## partyIndependent -0.012992   0.013301  -0.977 0.328736
## partyLab          0.027058   0.003784   7.151 1.00e-12 ***
## partyLibDem       0.079007   0.010726   7.366 2.09e-13 ***
## partyPlaidCymru  -0.007798   0.019412  -0.402 0.687923
## partySNP          0.036809   0.006266   5.875 4.54e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
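To read these tables, it may help to work through one row by hand. Because party is a categorical covariate, the intercept is the expected topic proportion for the reference party (the factor level that does not appear in the table), and every other coefficient is a difference from that reference. The sketch below uses only numbers copied from the Topic 2 output above; the variable names are made up for illustration, and the interval is a rough normal approximation rather than anything stm reports itself.

```r
# Coefficients copied from the Topic 2 table above
intercept <- 0.055001   # expected Topic 2 proportion for the reference party
coef_lab  <- 0.069135   # difference between Labour and the reference party
se_lab    <- 0.005451   # standard error of that difference

# Expected share of Topic 2 in a Labour speech: intercept + coefficient
prop_lab <- intercept + coef_lab
round(prop_lab, 3)   # 0.124

# Rough 95% interval for the Labour difference (normal approximation)
ci_lab <- coef_lab + c(-1.96, 1.96) * se_lab
round(ci_lab, 3)
```

Under this reading, Labour speeches devote roughly 12% of their content to the trade and single market topic, about twice the share of the reference party.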
Alternatively, you can directly plot the results using plot(). If you want to see the documentation for the plot command, check ?plot.estimateEffect:
plot(est,"party",topics=2)
plot(est,"party",topics=3)
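Point estimates per party can be hard to compare by eye. plot.estimateEffect() also offers method = "difference", which plots the contrast between two covariate values directly. The following is a minimal sketch on the small example objects that ship with the stm package (gadarian and gadarianFit), not on the Brexit model, so that it runs without the course data; the axis label is my own choice.

```r
library(stm)

# gadarian (metadata) and gadarianFit (a fitted 3-topic stm) ship with the stm package
prep <- estimateEffect(1:3 ~ treatment, gadarianFit, gadarian)

# Difference in expected topic proportions: treatment = 1 minus treatment = 0
plot(prep, covariate = "treatment", topics = 1:3, model = gadarianFit,
     method = "difference", cov.value1 = 1, cov.value2 = 0,
     xlab = "Difference in topic proportion (treated - control)")
```

For the Brexit model, the analogous call would pass est and stm_2, set covariate = "party", and use two party labels for cov.value1 and cov.value2, for example "Lab" and the label of the reference party as it is spelled in your data.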
If you want to go a step further and get a feeling for the quality of the model, try the word and topic intrusion tests in oolong.
You can read about the reasoning for both tests here.
library(oolong)
oolong_test <- create_oolong(stm_1)
oolong_test
oolong_test$do_word_intrusion_test()
oolong_test$lock()
oolong_test
oolong_test2 <- create_oolong(stm_1,brexit_bill$text)
oolong_test2
oolong_test2$do_topic_intrusion_test()
oolong_test2$lock()
#if you don't want to code all topics use this line:
# oolong_test2$lock(force=T)
Take the results with a grain of salt: we didn’t really select a ‘good’ model; we just went with what we had. Still, it shows the challenges involved in interpreting topic models. Also, if your performance was low, consider whether pre-processing steps (e.g. stemming) may have influenced the results, and keep in mind that recognizing intruders is far more difficult when the goal is to measure different frames of the same debate rather than broad topics.