If you ask us what the single biggest "technical challenge" of creating research※mesh is, the answer is unequivocal: corralling article metadata.
So here's a nerdy little piece about metadata, just in case you are interested in learning more about what "still searching for metadata" means in your admin report! 🤓
First things first: the research publishing landscape is a jungle. A chaotic one.
The scientific publication ecosystem is a mishmash of legacy systems and new technology platforms cobbled together in dynamic and ever-evolving ways. No single entity is "in charge" of the academic publishing ecosystem—but it is exactly that, an ecosystem. And it is an ecosystem that has some "Wild West" characteristics.
Different publishers (and different conglomerates of publishers) follow different standards for the amount and type of metadata they make available on new articles. Journals themselves can be managed by corporations or by individuals off the side of their desk—with varying interest, attention and commitment to the metadata of the articles they publish. And, when good metadata does exist, it might be strewn across different registries or be inconsistently indexed across various knowledge graphs.
The best part? Every single new system that has been created to address challenges or problems within the ecosystem has, itself, become yet another part of the ecosystem.
Why is sufficient article metadata so important?
To automatically generate content with precision and accuracy, research※mesh requires substantive and dependable metadata as "ground truth." The reason our use of artificial intelligence is so consistently accurate is that the "AI part" is actually only a very thin layer of the process, built on top of a much, much bigger stack of linear indexing and retrieval processes that ensure systematic and precise metadata for every piece of research showcased in a newsletter.
When you see a "still collecting metadata" status for a publication in the queue, it simply means that the underlying data retrieval processes have not yet established enough ground truth about the article to proceed.
What causes insufficient metadata for an article?
Missing and incomplete metadata can result from lags between when new articles appear in some registries and when that metadata propagates to other databases and knowledge graphs. There can also be errors in the system (such as a publisher incorrectly specifying the DOI for an article) that take days to months to be corrected across systems. Or the publisher or journal may simply not have provided the data at all.
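To make the idea of "enough metadata" concrete, here is a minimal sketch of a completeness check. This is purely illustrative, not research※mesh's actual implementation, and the field names are assumptions:

```python
# Hypothetical sketch: deciding whether an article record has enough
# metadata to serve as "ground truth". The required fields below are
# illustrative; a real pipeline would define its own.
REQUIRED_FIELDS = {"doi", "title", "authors", "journal", "published_date"}

def missing_fields(record: dict) -> set:
    """Return the required fields that are absent or empty in a record."""
    return {f for f in REQUIRED_FIELDS if not record.get(f)}

def has_sufficient_metadata(record: dict) -> bool:
    """True only when no required field is missing."""
    return not missing_fields(record)
```

Under this sketch, an article fresh from one registry might arrive with only a DOI and a title; it would sit in the queue until the remaining fields propagate from other sources.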
The beauty and power of research※mesh is that it automates the task of going back and searching for correct (or corrected) metadata over the course of time. This is a terrific job for a robot and a painfully tedious job for a human. Thank you, robots.
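The "go back and check again later" job described above is essentially a retry schedule. As a hedged sketch (the intervals here are made up, not research※mesh's actual schedule), re-checks might be spaced with exponential backoff so that fresh articles are polled often and stubborn ones less frequently:

```python
from datetime import datetime, timedelta

def next_check(last_checked: datetime, attempts: int,
               base: timedelta = timedelta(hours=1),
               cap: timedelta = timedelta(days=7)) -> datetime:
    """Schedule the next metadata re-check with exponential backoff.

    Waits 1 hour after the first attempt, then 2, 4, 8... hours,
    capped at one week. All intervals are illustrative assumptions.
    """
    delay = min(base * (2 ** attempts), cap)
    return last_checked + delay
```

The design choice is the usual one for polling external registries: frequent checks right after publication, when propagation lag is most likely to resolve, tapering off so long-tail edge cases don't hammer anyone's servers.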
We are still working on the edge cases.
When you receive a "missing metadata" notice in your admin report, our team gets the same alert. These are the outlier articles, and innovating solutions for these remaining categories of edge cases is a top priority for us. The fact that we have already figured out how to handle so many quirky scenarios gives us confidence. However, as we solve the easier problems, the remaining ones become increasingly difficult; that is the inherent nature of edge cases.
If you have been using research※mesh for a while, you have probably noticed that the vast majority of metadata issues eventually resolve as the requisite data is identified and retrieved. But, yes, we definitely still have a few hard nuts left to crack! So we just wanted to provide this little summary to highlight one point: when you see a missing metadata notice, we are on the case! 🦸🤖📄