SEASON 3 EPISODE 2

Data Cartels and Surveillance Publishing

Over the last few years, as the process of conducting research and scholarship has moved more and more online, it has become clear that user surveillance and data extraction have crept into academic infrastructure in multiple ways.

For those committed to preserving academic freedom and knowledge equity, it’s important to interrogate the practices and structures of the companies that are collecting and selling this data, and the impacts of this business model on academic infrastructure – and particularly on already marginalized and underfunded scholars and students. 

To help us understand this landscape and its implications, today we are in conversation with Sarah Lamdan, author of the forthcoming book Data Cartels: The Companies That Control and Monopolize Our Information.


Transcript

Safa:

You are listening to the Unsettling Knowledge Inequities podcast, presented by the Knowledge Equity Lab and SPARC – the Scholarly Publishing and Academic Resources Coalition.

Over the last few years, as the process of conducting research and scholarship has moved more and more online, it’s become clear that user surveillance and data extraction have crept into academic infrastructure in multiple ways.

For those committed to preserving academic freedom and knowledge equity, it’s important to interrogate the practices and structures of the companies that are collecting and selling this data, and the impacts of this business model on academic infrastructure – and particularly on already marginalized and underfunded scholars and students. 

To help us understand this landscape and its implications, today we are in conversation with Dr Sarah Lamdan, author of the forthcoming book Data Cartels: The Companies That Control and Monopolize Our Information.

Sarah: My name is Sarah Lamdan. I’m a Professor of Law at the City University of New York School of Law. And I’m based in Long Island City.

In 2017 I was working as a Librarian. So I’m both a Professor of Law and I’m also a Librarian with a Master’s degree in Legal Information Management. So I was working in a law library and somebody sent me a news article that said: LexisNexis was one of the companies vying for a contract for ICE’s extreme vetting surveillance program. 

ICE is Immigration and Customs Enforcement. It’s actually a sub-agency under the Department of Homeland Security, and they’re thought of as kind of the immigration police, immigration enforcement – and in the era when I learned about the ICE LexisNexis connection, ICE was involved in a lot of really ethically fraught activities. There was the whole revelation about, you know, child separation; people were really concerned with the kind of human rights boundaries that Immigration and Customs Enforcement in the United States was crossing.

And as a Law Librarian, you use LexisNexis all the time. You use the legal platform all the time, and I didn’t understand how Lexis, which to me was a legal research platform, could be involved in surveillance, government surveillance and ICE surveillance – and that kind of opened up this line of questions and research that, five years later, has culminated in a book about data analytics companies like LexisNexis, Reed Elsevier LexisNexis.

So when I started researching, all I was aware of was that LexisNexis was a legal information platform. You know, part of this duopoly that Thomson Reuters and RELX share, where they are the main legal information providers in the United States. Then I learned that Reed Elsevier LexisNexis is also the umbrella company for Elsevier, which is the biggest academic information and research company in the world.

And then also I learned that LexisNexis is one of several major government data brokers that sells our personal information to the government. And they’re also a major financial data provider that provides data to, you know, banks and insurance companies and other financial institutions.

And I started to see that RELX and a few other major data companies were occupying and kind of monopolizing multiple information markets. And I also saw how companies like Reed Elsevier LexisNexis and Thomson Reuters kind of stifle competition, and kind of act in lockstep with the other members of their kind of monopolizing group of companies – they behave kind of like cartels that, you know, implicitly – it’s always implicit – act in concert to stave off competition and to kind of maintain and gain more control of big information markets, like academic information, legal information, personal data.

So my book discusses the dangers of letting several data analytics companies monopolize and act like cartels in information markets.

It’s kind of complex, because these companies do different things in different markets, right? And each of the different markets has different names. Like the data brokering market – we usually think of that as being, like, dossiers or bits and pieces of our personal data, right. Companies that sell facial recognition data or geolocation data, or, you know, just robust data dossiers with public and private records and social media records.

That whole kind of personal data industry is oftentimes called the data brokering industry, but we don’t call academic information platforms academic data brokers. Right. We call them research platforms or even academic journals. Right. And legal research platforms are called computer-assisted legal information services. Right?

So each of the markets has different names. But really when you get down to it, what all of these markets are doing and what I think all of these companies have in common is that they’re all selling information, whether it’s personal information, academic information, legal information, and data – data points about all of those different things, to different consumer groups, right?

To academics, to lawyers, to surveillance enterprises. 

So these companies are all information and data purveyors across multiple markets. Regardless of what kind of jargony names we give each of those markets. 

Safa: In order to better understand how we got here, it’s important to note how the shift of academic information from paper resources to digital platforms has opened the door to a decrease in privacy and increase in surveillance practices.

Sarah: The transition of academic information from paper resources like journals, you know, in stacks and shelves and libraries to these digital platforms has opened up more informational opportunities for companies like Reed Elsevier LexisNexis, and all of the companies that are vending and selling this kind of information.

Because platforms – they can provide information and structure information and create databases that can be searched in certain ways, but they can also collect information, right. They can collect user information. So these platforms are these walled garden infrastructures where users have to log in, right. In order to fully use ScienceDirect or fully use LexisLaw, you have to use a password or affiliate with an institution that somehow is an identifier for you, whether it’s your personal identifier, you know, like Sarah Lamdan, or whether it’s your institutional identifier, CUNY School of Law. Right. You have to flag who you are.

And then as you move around the database, you click on things, you look at things, you download things. You know, you click on hyperlinks. All of that data about you can be tracked and collected. And that data is really valuable. Personal data is really valuable, right?

It can be used to evaluate the impact of certain journals – you know, which journals are getting more clicks, which information is more popular. But it can also be, you know, used to identify you as a researcher, right? This is what you research. This is who you affiliate with. And those are, you know, just two examples of the ways your information is valuable.

So companies can now take all of your information and make that into a whole new product to sell – whether it’s impact factor projections or suggestions to institutions about what kind of research they should fund, or which researchers are the most productive. 

But also information about you – you know, are you a good bet for tenure? What institution should hire you? Different evaluations about you, as well as about the research and the institutions themselves. So that information can be sold to a whole new array of consumers. It can be sold to tech firms and pharmaceutical firms and other companies that are investing in research and profiting from research, and it can be sold to academic institutions and grant funders.

So it opens up a whole new venue for consumers and profits. 

Safa: While it’s true that many of us voluntarily sign up to use these platforms, we often don’t fully understand the terms and conditions we are consenting to – nor have a real option to opt out, if we want to continue participating in the academic research ecosystem. 

Sarah: In a sense, we can all assume that when we’re using any platform online, whether it’s a social media platform or a research platform, we all, at some point, click assent to some sort of agreement, right.

We all agree to some terms of service. We click “I agree.” Or we fill out, you know, our names and email addresses or whatever – some other identifying information – to make some sort of password for ourselves. So in that sense, like, yes, we are notified. But as we all know, those notifications are never clear. They’re never straightforward, and there’s really no clear way to see how your data is being collected and how it’s being used, even with those kinds of terms of service being made available to you, or you consenting to them.

So in some ways, yeah, we probably know that we are being tracked, but we don’t know what that tracking entails. The actual tracking and then the actual use of our data, those are both very opaque processes that don’t have a lot of transparency around them. 

So, Dorothea Salo, who is a researcher at University of Wisconsin, I believe, she actually was able to see kind of the backend of some of the data that was being collected on ProQuest, which is now a Clarivate data analytics entity. 

And she actually filed a freedom of information request in her state to get those kinds of back-end logs, to see what data was being collected about her.

And the interesting part was that you could see all the types of data, like, on the top of the spreadsheet about what kind of data could be collected. And it was pretty invasive. It’s your name, your gender, your address. What you click on, how long you look at it, what time of day you’re clicking, you know, when you’re logging in, your institution – just all sorts of categories of information that could potentially be collected.

Now, a lot of what data gets collected depends on your institution because your institution might determine what kind of data is associated with your student ID and how much data is associated with your student ID – and your student ID might then be the link point between you and your login to ProQuest and other school affiliated platforms.

So it varies from institution to institution, but the capacity is to collect all sorts of identifying information, depending on how much your school gives them, how much you personally give them in exchange for your login. And one interesting thing that’s starting to happen – you see it happening with Reed Elsevier LexisNexis as an example – is that these companies, these data analytics companies, are acquiring other companies that provide other data sets.

So one thing that SPARC has actually been doing great work on is following the acquisitions of companies like Interfolio. So Interfolio is a whole separate product that is used by both grant funders and academic institutions as portals for job applicants and grant recipients to insert all sorts of personal information about themselves. 

So, you know, in order to apply for a job at an institution, you might use Interfolio as your hub, and you might upload your CV into it, all of your letters of recommendation. If you’re using a grant funder that uses Interfolio, you might be directing a lot of data through Interfolio about how you’re spending the grant money and how your grant proposal is being carried out in a step by step way with regular updates.

And now that kind of data also flows through to Elsevier, which is a research platform, right? So it’s not just when you log in now that your data is going to the same central data hub. It’s also through these other products that companies like Clarivate and companies like Reed Elsevier LexisNexis are acquiring.

And again, we can’t see how they are using the data. We don’t know, right. Because it’s very opaque. It’s very non-transparent. They would probably say that whatever algorithms they’re using or whatever processes they’re using are trade secret protected, you know. Like how they develop their impact factors and their data sets, and how they use them, is their own proprietary information also. So we can’t know how data is being used, because there’s no regulation requiring these companies to be transparent. There are no rules about how these companies use our data – or very few rules.

So we’re not sure what’s happening with our data, but we do know it’s coming from us when we log in and do research on the platforms. But they’re also getting our personal data through all sorts of other sources. 

Like if you think about it, Reed Elsevier LexisNexis – the LexisNexis part of that company – is also a major data broker for ICE. For hundreds of other law enforcement agencies and government agencies. So they also have those data dossiers about all of us, right?

So they have our academic data over here. They have our data that they’re selling to law enforcement over here. So there’s just a lot of personal data flowing through these companies that also sell us our research products.

Safa: This type of user surveillance and personal data extraction has become a leading business model within academic infrastructure, in North America and more and more so globally. 

Sarah: I think other companies are looking to this model as a good way to proceed in the future.

I mean, one of the things about the publishing industry is that it has suffered over the last few decades. As information has become more readily available online, it has suffered financial issues and, you know, kind of new sources of profit seeking, as its models are changing, right?

Once libraries stop buying paper journals, companies like, you know, Elsevier, Springer, what have you, have to rethink how they’re going to do business and ensure that they’re going to continue to be profitable. So I think a lot about, and I study, this vertical kind of growth into these other data industries that are flowing through the system.

But one thing Leslie Chan’s research does is also show this horizontal kind of opening of new information markets. Elsevier really used to get its main profits, its main, you know, income flow, at the point of sale. So Elsevier would make a journal, would get these editors to edit the materials, do peer review, and then Elsevier would make money when the final journal was available and they could sell that final journal to libraries. But now Elsevier has figured out how to derive profits from the pre-publication process.

So platforming preprints on, like, bepress or other, you know, preprint services. SSRN is now a Reed Elsevier LexisNexis company. So they’ve found ways to profit on the preprint side, and then also to develop those impact factors and those post-publication data and metrics services that they can then sell post publication. Right?

So there’s also been this opportunity to profit horizontally in more places than just the point of sale. And I guess this loops back to what I was saying previously about how the companies are figuring out new ways to profit in this changing publication model, you know, in this transition from paper to digital information services. 

So yeah, I think that horizontal growth that Leslie Chan describes, and then also the acquisitions – of Interfolio, of Clarivate purchasing ProQuest to get its data cloud – all of those things have this common thread of using personal data to enrich these information products and to make new information products to sell.

Safa: You might be asking yourselves, what does this mean for faculty? How alarmed should faculty be by this?

Sarah: A lot of times when I talk to faculty and librarians about this issue, people are like, well, we all use Facebook, right? Our data’s already out there. Who cares, right? Who cares if Elsevier wants to act like Facebook and suck up our data and use our data?

But I think that it’s of a special concern for academic information for several reasons. The first reason is that we are giving a lot of power to these private companies who aren’t scientists and who aren’t experts in various humanities or science fields, to determine what academic enterprises get funded, which get valued for having higher impacts, lower impacts, right?

They assign value to academic and knowledge products that they aren’t experts in. Right. They take a lot of power away from academia to control the academic process and the whole knowledge enterprise, right? Like the way we develop and support academia and academic pursuits.

And then another huge problem with using data this way, and academic data this way, is it takes the systemic biases built into academia, and it perpetuates them and embeds them into digital systems. Right. So we all know that academia has traditionally ignored the work of people of color, of women, of institutions that aren’t considered as ‘elite’.

So, if you are a white man working at Harvard Law School, your scholarship is gonna get more attention than my scholarship, right? At CUNY School of Law, as a woman. And we know that that is even worse if you are a person of color at a historically Black college or university, or you’ve published in a journal that has certain terminology in its title versus, like, Harvard Law Review.

So these biases don’t disappear in electronic formats – they show up in the kind of data that Elsevier and Clarivate rely on to make metrics, and to make these predictions and prescriptions that their data analytics products formulate, and that they can then sell to hiring institutions and to all the funding companies and hiring companies in the world.

That also is really harmful, because when all of this data accumulation and collection, and then all these data analytics, happen in an opaque way that’s not transparent, we can’t even see how this bias is happening, what kinds of biases are happening – and then give advice or instruction about how to stop and break down these biases, and create a better, more equal, and more expansive ecosystem for academia.

So you know now that when you do research on any of these platforms, your data is being collected. There’s no truly unsurveilled, private research that can happen in these ecosystems as they are currently formulated.

So that is problematic for academic and intellectual freedom, right? There’s the lack of privacy, and then also, as far as academic freedom goes, the more we rely on these platforms and systems to do the work of academia, the work of research, the work of governing hiring processes and tenure processes, and sorting through who gets higher metrics and who gets lower metrics – the more we concede that to these companies, the less control we have and the less freedom we have in our work, right? So we’re giving away our privacy, and we’re also giving away our ability to control our academic destinies and to understand how these decisions are being made.

Safa: Amongst faculty, library professionals particularly face a tension between their commitment to the privacy of their users and their role in purchasing products and services from these companies.

Sarah: It’s interesting, because along with being a law professor who can say, you know, oh, I’m very separate from that, I don’t have to worry about that – I’m also a librarian. So I’m very intimately and first-hand aware of how these kinds of problems make it harder for libraries to do their jobs. Right.

It creates a real ethical problem where librarians have to balance providing much needed services to their patrons, and then also ensuring that their patrons have privacy – that people who use their libraries, whether it’s academics or the public, also get to maintain their privacy. Because right now, both of those things can’t happen at once.

And that is a real clash point for librarians. It makes librarians’ jobs really hard, and it presents these really difficult, if not impossible, choices to libraries, where they have to decide: am I going to provide these journals that my faculty and my students rely on?

Or am I going to say, “We can’t use these platforms, because you are collecting our data and we don’t know if you’re selling that data, or we don’t know how you’re using that data.” Right.

It’s really hard. Especially – I’m a law librarian, so law libraries that rely heavily on Westlaw and Lexis don’t feel like they can just turn around and tell their students and faculty and lawyers: hey, we’re not gonna use these products anymore. We’re not gonna contract for these products anymore.

There could be regulations or rules that create those types of safeguards and protections, but right now there are not. 

Safa: Another area of concern is the relationship of these data brokers who operate in the academic publishing space with government entities. The personal data they sell can contribute to the violation of human rights, racial profiling and other forms of violence and harm.

Sarah: ICE has gotten the most attention because their work has really raised a lot of red flags, and their surveillance infrastructure has been used to separate children from their parents and do really harsh, abusive things. And also there’s this movement called No Tech for ICE that’s led by these really excellent immigration advocacy organizations that have done a really good job of shining a light on how ICE uses these systems.

But these systems aren’t just being used by ICE. They’re being used by local, state and federal law enforcement. Most government agencies have some sort of LexisNexis special services contract, or other data contract. Right.

They’re being used by the IRS to sift out who is more likely to commit tax fraud. They’re being used by the United States Postal Service, right? Using our social media data and these brokers to figure out who might be committing mail fraud.

They’re serving as these major data centers for local, state and federal police forces to just pool and use each other’s data and to be able to link and use the data systems that these companies provide. 

They’re being used by insurance companies, tenant screening companies, employment screening companies, healthcare systems that are fighting the opioid war – like, that’s what they like to say they are doing.

So all of these systems, both major public institutions and major private institutions, make really big decisions about our lives, right? About what kind of services we can get, whether we can access our bank accounts, whether we can access healthcare and certain types of medications and medical interventions. All of these systems are using third party data analytics companies to help them in their work. And that includes LexisNexis and Thomson Reuters, companies that also provide major research platforms.

Safa: Of great concern is the fact that this infrastructure is built on and digitizes already existing inequities, putting at risk and serving to further marginalize already underfunded students and scholars. 

Sarah: The first risk is embedding those systems of injustice – historical bias, racism, xenophobia, what have you – into digital systems, by using data that favors certain institutions, certain professors; it generously overrepresents some types of scholarship and then underrepresents others, you know?

And you think about impact factors or, you know, other metrics that just infuse this historic bias into these digital systems.

But then the secondary type is if you are being surveilled – surveillance, especially government surveillance, tends to disparately impact certain communities, certain types of institutions and certain individuals, right?

This is a problem that happened in New York City that, you know, people in my law school did a lot of work around: Muslim students were being surveilled, you know, uniquely surveilled, in the wake of September 11th in New York City. And there is actual evidence that that occurred.

And so these types of systems collecting data – if they choose to hand that data over to law enforcement, to other surveillance forces – are also more likely to harm certain academics and certain institutions.

And without real concerted efforts to create an equal playing field and to not gravitate toward the same things again and again – you see how certain tech companies become the major tech companies, certain academic institutions become the major academic institutions, certain academic focuses become the major focuses, to the detriment of everything else.

So, you know, a lot of people love thinking about artificial intelligence, even though there’s a lot of evidence that artificial intelligence isn’t real – that computers can’t be sentient, that we’re not there yet, and we may never be there with replacing humans with artificial intelligence. Artificial intelligence is a buzzword; it’s profitable.

So imagine academic studies that use the phrase artificial intelligence and are done at particular institutions that work closely with, like, Facebook or other big tech companies. Those studies get clicked on a lot more, especially by, you know, Twitter and Facebook – their researchers are well funded, and they’re using that work and perpetuating that work. So that work gets an artificially high, or just a very high, impact factor. And it looks like it is very popular.

So the data analytics companies, you know, Clarivate, Scopus, what have you, and Elsevier data analytics and all these data analytics companies predict that artificial intelligence research is going to be the most profitable, gonna be the next big thing. 

So before you know it, major academic institutions are spending all of their money on artificial intelligence labs. Right. And academics who decide to go into artificial intelligence have a much better chance at getting postdocs and getting jobs at these big companies; you know, that’s seen as, like, the big type of research. So all of the money, all of the talent, all of the interest is being siphoned in one direction by these data analytics companies. Right. So what happens to climate change research? What happens to research, you know, critiquing AI and saying, ooh, maybe AI is not gonna – you know, maybe that’s not the wave of the future, maybe that’s a mistake, or that’s not realistic?

If we let Scopus decide that, or, you know, Clarivate decide that instead of letting the actual academics decide that, I think that just doesn’t bode well for academic decision making and the knowledge enterprise.

Safa: All this being said, there are many groups of stakeholders who are organizing and pushing back against the consolidation of power and the opaque and harmful practices of these companies.

Sarah: So it’s really inspiring, especially among librarians, who are obviously kind of the first-line people noticing that this is happening – and there’s really a lot of concerted thought and concerted effort around this.

I think we’re still at kind of this phase where we’re informing everybody about the problem, right. 

I talked to a lot of different people this year, and every room that I walk into – virtual or real – there are people there who haven’t heard about this problem yet. Right.

So I think we’re still at the phase where we’re explaining these problems to everyone. Right. 

My research is fairly new, Leslie’s research is fairly new, right? So I think now there’s a critical mass of people who are aware of the problems. So now we’re kind of entering this next phase, which is really exciting – organizations like Library Futures and Library Freedom Project are starting to think about, now that we recognize that this is a problem, that this data analytics transition that’s happening with our publishers is a threat to privacy, is a threat to the knowledge enterprise, what do we do next? 

And I think, you know, SPARC is starting to think about setting up a group, a space, I guess, to discuss these issues, right? Kind of a community around these issues that will be librarians thinking about – authentically, like what can they do and what would be best for their institutions?

And then working together in that well organized space.  

And other organizations like Library Freedom Project have made scorecards about rating different products, like which products will give you more privacy, how are these companies potentially using your data? And we’re about to release one about legal research that we’ve been working on together. So that’s really exciting. 

And it’s a really useful tool for people who are concerned. And really, other organizations like Library Futures are thinking about other interventions and other paths forward, where we can continue providing library resources and continue being effective, even as these companies transition away from their traditional library services.

We have found activism, and seen activism, around this topic popping up in some interesting places. And one of the big ones is shareholders. So for one company, Thomson Reuters, which is a Canadian company, a big union that invests heavily in Thomson Reuters filed a shareholder resolution asking for Thomson Reuters to identify and examine the financial risks of its participation in ICE surveillance.

At the time that the shareholder resolution was drafted, Thomson Reuters was the largest data broker for ICE’s surveillance infrastructure, and Thomson Reuters is still a very large participant and has multiple contracts with ICE for personal data.

So the shareholders demanded that Thomson Reuters investigate and report out the potential risk that the company was taking on, and exposing its shareholders to, as a result of being involved in those potentially human rights abusing practices. And – I believe it was this year – Thomson Reuters realized that it was going to have to do that reporting.

And now it will be putting out a report about its involvement with data surveillance, and particularly ICE, in the future. So yeah, there’s been effective activism from immigrant advocacy groups, but also from shareholders in these companies.

And our academic institutions have a really important and large role to play in the way these companies can access and collect our data.

And I think a lot of institutions aren’t even aware of how much power they wield. So a lot of how much data these companies can collect through their research platforms depends on how much data academic institutions are collecting and how much data they’re allowing to flow through to these products.

I think it was Dorothea Salo’s blog post where she described the results of the records request she filed. But the way she put it, the ProQuest platform couldn’t collect certain bits of data about her because that data wasn’t collected by her institution, right? It wasn’t attached to her student ID. And so if the institution doesn’t collect the data, then ProQuest can’t collect that data and see that data. Right.

So a lot of times the information that your own academic institution is tracking will impact what kind of data is flowing out to these third party companies, right?

And so then, you know, we’re not sure who the entities purchasing these metrics from the third parties are.

One thing that has been cropping up a bit and that librarians especially have contacted me about is there’s this kind of clash of interest in academic institutions, between libraries who want to ensure privacy for all of their users and the academic institutions themselves who are concerned with security and theft – they wanna track where people are on campus in case something is stolen or in case somebody is somewhere where they’re not supposed to be.

They want to track who checks out an iPad and who, you know, logs onto a computer at the computer lab, or what have you, in order to ensure that nobody’s doing anything illegal on these platforms, right? 

So there’s this clash of campus security and digital security and then data privacy, because those two things don’t usually get along, right. One involves collection of a lot of data and a lot of tracking of students and faculty and people on campus. And then the other one is the opposite of that. Right? We’re going to expunge data. We’re not going to collect data. 

So I think it’s worth having just an open, frank discussion in academic institutions about where to draw those lines: how do we balance the interests of security and the interests of privacy, and which values are we more interested in protecting?

I think that’s an important discussion to have. So I’ve actually been contacted by some librarians whose campuses are just all-out installing spyware on all sorts of digital devices on their campus, and they’re doing it – their intent is to protect school property and to protect students, right.

They think that that’s a good kind of policing, protecting measure, but librarians see it as an invasion of privacy and as another conduit for data that may eventually flow through to these other systems. Because once a campus collects data, besides FERPA and certain privacy laws that are very limited, there’s really no way to ensure that that data won’t creep out and spill out into other uses.

Safa: Recently, Dr Lamdan was asked to testify in front of Congress about her research and expertise on this issue, as they are considering some strategies for regulating the activities of these companies. 

Sarah: It has been a really long, tough road to pass any sort of data privacy legislation in Congress. Some states have been effective: California has passed some data privacy and data broker legislation, and Vermont and some other states have also moved in the direction of starting these data broker registries that would require companies like LexisNexis to register as data brokers and then also provide people with their data dossier, so that people can see and correct their own data. 

So there has been, on a state level, there’s been kind of a patchwork of activity, but federal data privacy activity has been hard to come by.

However, I recently had the opportunity to speak to the House Judiciary Committee and they were considering a law called the Fourth Amendment is Not for Sale Act. So that bill is an effort to impose warrant requirements on third party data providers like LexisNexis and other types of facial recognition, geolocation data providers. 

So basically what people are trying to do is ensure that if the government uses our data dossiers from these companies, from these data analytics companies, they first have to get a warrant which involves going to a court and showing probable cause and getting a particularized warrant for a particularized reason – and right now none of those protections exist. 

So it’s an interesting approach. It doesn’t solve the whole problem, but it’s a good step. And it’s an important step, right? The constitutional protection against unreasonable searches and seizures is important for our privacy and for due process.

So yeah, that’s one of the more popular data privacy and data governance type of laws that’s being considered.

Honestly it’s hard for me to link that kind of narrow thing with this huge problem of the knowledge enterprise, because to me  the Fourth Amendment is Not for Sale Act is interesting, but it doesn’t come anywhere close to solving the problems that we are discussing here. Right? 

It doesn’t, it doesn’t prevent the whole knowledge enterprise from being overtaken by data analytics companies, which is kind of a bummer. I wish Congress would pay more attention to that problem, because I think it’s also a huge problem. 

And unfortunately, I feel like this is very Eurocentric, but most often the law or the regulation that people ask about is the GDPR in the EU, right? It’s their major data protection regulation. And it also does not solve the problems that we’re describing here, but it does kind of open the door for consumers to see what kind of data is being collected by these entities, and to ensure that the data that they’re selling is correct, and to kind of participate more in the pipeline of our data use and data sales.

It’s an interesting model, one that the US has not yet followed, but I think it’s a popular model internationally, and maybe other countries beyond the EU have also implemented similar kinds of regulations. So these regulations don’t stop the data from being collected. They don’t prohibit the data from being sold, but they do make the data collection process less opaque, and they allow for more public participation in that process.

Sarah: The one thing I want everyone to know – the main point of my research and the whole reason I wrote the book is I want people to understand that all of these information markets are connected and they’re all being controlled by the same several companies. Legal research, academic research, financial data, and personal data collection and sale – all of those different services are all being offered by the same several companies.

There are just a few companies that are overtaking every informational market, and we should pay really close attention to those companies. Because if we are interested in information flows and information access and data privacy, these companies have a lot of control, even though we don’t discuss them. 

We discuss the big five all the time. We discuss Facebook, we discuss Amazon and Google. But we don’t talk about Reed Elsevier LexisNexis, we don’t talk about Clarivate, we don’t talk about Thomson Reuters. These companies are also multi-billion dollar informational giants and data giants that deserve the attention of academics, that deserve the attention of regulators.

And we should all be mindful of them. And that’s why I would encourage people to read Data Cartels because it describes and lays out that landscape pretty clearly. 

So my book Data Cartels is coming out November 8th – that’s its publication date, its birthday – and it is available for pre-order on the Stanford University Press website or wherever you buy your books. It’s available on all the common bookselling platforms for pre-order now. 

Safa:  Thank you so much for tuning in. 

If you are provoked by what you heard today, we invite you to join us at the Knowledge Equity Lab. Together we can fundamentally reimagine knowledge systems and build healthier relationships and communities of care that promote and enact equity at multiple levels. 

Please visit our website, sign up for our mailing list, follow us on social media and send us a message to get involved!