We Need To Talk about OER Discovery

Last November I was part of an Open Education Conference 2020 panel entitled “We Need To Talk about OER Discovery.” Six questions focused the discussion, with each panel member contributing their thoughts. Here are my responses:

How would you describe the current state of OER discovery?

Let me start by saying that six or seven years ago, surveys routinely showed that simply “finding” OER content was the most significant barrier to adoption. For several years afterward, you could count on seeing some mention of “difficulty finding OERs” in articles and reports.

I’m happy to say that in recent years–assuming you stay away from commercial publishers’ brochures–you rarely see issues around OER discovery getting those 30-point headlines.

But what I do still see, three years after we launched our Mason OER Metafinder, is that while it’s true we don’t talk so much about simple discovery anymore…it seems we’re all thinking more than ever about the need for more efficient discovery.

What do I mean by more efficient?

  • higher signal to noise ratio in retrieval
  • less duplication of content in our search results
  • unambiguous usage rights for every item retrieved
  • and a way to quickly assess the pedagogical “fit”

If we had these last two bits of information–usage rights and pedagogical fit–we could easily slice and dice our result sets via facets.

If we ignore for a moment the traditional discovery solution–wherein the searcher dives in and out of various silos looking for appropriate material–there are two approaches to solving the discovery problem across multiple content providers. 

  • use a just-in-case system like MERLOT or SUNY’s OASIS. The “just-in-case” tag comes from supply chain management…you store a lot of inventory just in case someone needs it. Here, the inventory is the metadata that’s harvested from various OER content providers. That metadata is normalized to some degree…then indexed…and it’s that index that you’re searching. Results are limited to items the search system knows about.
  • or use a just-in-time system like the Mason OER Metafinder. Ours is a “just-in-time” system because, again drawing an analogy to supply chain management, we maintain no stock but rely on prompt delivery from our suppliers: OER content silos. Instead of searching a pre-built index, when you submit your query the Metafinder launches real-time parallel searches, one for each of the up to 21 sites you have asked it to search. It then collects the top 100 results from each of these sites, dedupes and ranks them, and combines them all into a single, faceted result set.

For now, I’ll conclude by pointing out that each of these approaches has advantages and disadvantages and, as you might expect, each poses dramatically different maintenance requirements. To help bridge the gap between these approaches, we include metadata aggregations like MERLOT and OASIS along with content providers as search targets in our Metafinder.
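
For readers who think in code, the just-in-time flow described above (fan out one search per site, collect each site's top results, dedupe, rank, merge) can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the Metafinder is actually built: the site names, the canned result data, the `score` field, and the `search_site` and `federated_search` helpers are all hypothetical, and a real system would send live queries to each repository instead of reading a dictionary.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for live repository searches; a real system would
# send a query to each site instead of reading this canned data.
FAKE_SITES = {
    "site_a": [
        {"title": "Intro to Statistics", "author": "A. Lee", "score": 0.9},
        {"title": "Calculus I", "author": "B. Cho", "score": 0.4},
    ],
    "site_b": [
        {"title": "Intro to Statistics", "author": "A. Lee", "score": 0.7},
        {"title": "Open Chemistry", "author": "C. Diaz", "score": 0.8},
    ],
}


def search_site(site, query, limit=100):
    """Return up to `limit` hits from one site whose title contains a query word."""
    words = query.lower().split()
    hits = [
        dict(record, source=site)
        for record in FAKE_SITES[site]
        if any(word in record["title"].lower() for word in words)
    ]
    return hits[:limit]


def federated_search(query, sites, limit=100):
    """Query every site in parallel, then dedupe and rank the merged hits."""
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        per_site = list(pool.map(lambda s: search_site(s, query, limit), sites))

    best = {}  # (title, author) -> highest-scoring copy seen so far
    for hits in per_site:
        for hit in hits:
            key = (hit["title"].lower(), hit["author"].lower())
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit

    # Rank the surviving copies by score, best first.
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


results = federated_search("statistics chemistry", ["site_a", "site_b"])
for hit in results:
    print(hit["source"], hit["title"], hit["score"])
```

Nothing is stored between queries here, which is the point of the supply chain analogy: the only "inventory" is whatever the sites return at search time. The exact-match dedupe key is the weak spot, and it breaks as soon as two repositories describe the same item slightly differently.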

What main challenges/specific needs can you identify at this time?

We have several interesting issues in the OER content world that complicate discovery. 

  • First, there’s very little standardization of metadata beyond author, title and publication date. And across repositories, even those simple and seemingly straightforward metadata elements tend to drift a bit.
  • Then, there’s willful duplication of content across repositories. That redundancy is useful, I suppose, in a world where repository sustainability is always a concern…but once you open up cross-repository searching, it poses complications. For example, looking at results in the OER Metafinder, I’ve noticed that sometimes the same content is in two or more repositories but there will be slight variations in the author/title metadata on each site. It’s hard to teach a machine to unravel that duplication or to select the most appropriate copy.
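
To make the duplication problem concrete, here is a minimal sketch of one common way to flag probable duplicates when author/title metadata drifts between repositories: normalize both fields, then compare them with a fuzzy string ratio. The records, the `normalize`, `match_key` and `likely_duplicates` helpers, and the 0.9 threshold are all assumptions chosen for illustration. This is not the Metafinder's method, and it does not attempt the harder question of which copy is the most appropriate one.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Build a comparison string; author tokens are sorted so that
    'Diaz, C.' and 'C. Diaz' produce the same key."""
    title = normalize(record["title"])
    author = " ".join(sorted(normalize(record["author"]).split()))
    return title + " | " + author


def likely_duplicates(a, b, threshold=0.9):
    """Guess whether two records describe the same work (threshold is arbitrary)."""
    ratio = SequenceMatcher(None, match_key(a), match_key(b)).ratio()
    return ratio >= threshold


# Hypothetical records showing the kind of drift seen across repositories.
copy_1 = {"title": "Open Chemistry: An Introduction", "author": "C. Diaz"}
copy_2 = {"title": "Open chemistry - an introduction", "author": "Diaz, C."}
other = {"title": "World History Since 1500", "author": "D. Park"}

print(likely_duplicates(copy_1, copy_2))
print(likely_duplicates(copy_1, other))
```

Any fixed threshold will misfire in both directions (new editions, series volumes with near-identical titles), which is part of why this remains hard to hand off to a machine.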

What approach(es) do you think would best address these needs?

So from my vantage point–which is trying to offer a search engine that increases search efficiency–the key to fixing many of these issues is standardizing on a particular metadata schema for OER content…and then devoting time to enriching that descriptive metadata.

If I could just issue a decree, it would be that the community settle on a metadata schema that suits at least the basic needs of all interested parties. By that I mean let’s not follow our natural librarian impulse to over-engineer the solution before we deploy it. Instead, let’s focus on figuring out the minimum that improves on the current state of affairs while still offering an extensible design that can evolve and improve as we work with it. That simplicity will also speed adoption of the schema.

Then I’d give preference to those repositories and content providers that utilize the schema. 
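
As a rough illustration of what “minimum but extensible” might look like, the sketch below defines a record with a handful of required core fields plus an open-ended extensions map, then shows how unambiguous usage rights and a pedagogical-fit element make faceting trivial. Every field name here (`license`, `level`, `extensions`) and the `OerRecord` and `facet_counts` names are hypothetical choices for this example, not a proposed standard or an existing schema.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OerRecord:
    """A deliberately small core; everything else goes in `extensions`."""
    title: str
    authors: List[str]
    license: str  # unambiguous usage rights, e.g. "CC BY 4.0"
    url: str
    # Optional elements (e.g. "level", "subject") that can be added over time
    # without breaking consumers that only understand the core fields.
    extensions: Dict[str, str] = field(default_factory=dict)


def facet_counts(records, name):
    """Count records per value of the license field or of a named extension."""
    counts = Counter()
    for record in records:
        if name == "license":
            value = record.license
        else:
            value = record.extensions.get(name, "unknown")
        counts[value] += 1
    return dict(counts)


# Hypothetical records for illustration only.
records = [
    OerRecord("Open Chemistry", ["C. Diaz"], "CC BY 4.0",
              "https://example.org/chem", {"level": "undergraduate"}),
    OerRecord("Intro to Statistics", ["A. Lee"], "CC BY 4.0",
              "https://example.org/stats"),
    OerRecord("Calculus I", ["B. Cho"], "CC BY-NC 4.0",
              "https://example.org/calc"),
]

print(facet_counts(records, "license"))
print(facet_counts(records, "level"))
```

The "unknown" bucket in the second facet is the enrichment work in miniature: the schema can be adopted first, and the optional elements filled in as repositories get to them.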

What, if any, success stories do you know of?

I think one success I’ve noticed is the growing worldwide reach of OERs and the inter-connectedness of the world when it comes to OER discovery.

I try to track any library, libguide or webpage that provides a searchbox or link to our OER Metafinder. I post a list of those sites on our “About the Metafinder” page…and from the list link back to the page that links to our service. What that has turned into is a quick spot to view more than 400 OER-related libguides, websites or services. More than once I’ve heard from people who appreciate being able to so quickly find OER advocacy materials from literally around the world. For example, this past month, among the top 25 sites sending traffic to our Metafinder were sites in South Africa, Australia, Canada, Kenya, Taiwan and the Netherlands.

How can we work to reduce silos in OER discovery initiatives?

In the absence of any sort of cross-repository search mechanism, it’s absolutely true that the problem of discovery becomes more difficult as the number of silos increases. There’s a limit to a searcher’s energy, and diving in and out of silos all day can be exhausting. If, however, we have a more standardized way of surfacing the relevant content of each silo, then the number of them is more a computer scaling issue than it is a burden to searchers. So we’re at something of a fork in the discovery road. Do we think about how to reduce the number of places you have to look, or do we think about how we might build a single virtual OER database out of the many siloed repositories?

What role, if any, does accessibility and equitable data play in OER discovery?

Equitable data, or data equity, is all about finding and eliminating the ways that bias, assumptions, unfairness and prejudice can slip into a data project. I suppose you can find traces of those problematic impulses somewhere in the OER discovery universe, but my sense of things is that the OER movement in general is already quite far ahead of many other activities in valuing the open and equitable. So are there areas where we can improve? I can think of one, and that would be making sure bias and prejudices are not reflected in the metadata we develop in hopes of improving discovery. Thinking about equitable access, I think we might also work to ensure that our discovery platforms and our content delivery sources support the simplest, least-expensive device capable of reasonable function–rather than requiring an expensive computer to enjoy the best experience.