npm Blog (Archive)

search update: improved search in the npm CLI (and how we got here)

Over last month’s holidays, with the help of npms.io, npm introduced an improved search platform and brought it to the npmjs.com web experience. We’re really proud of how this project went: it was an opportunity to work with folks in the community and pull in an open-source solution that people love.

As we promised at the time, here are some more details about the how and the why, and an exciting announcement about bringing new search to the npm command-line tool.

a long time ago in a galaxy far, far away Oakland, California…

It turns out we’ve improved search several times in the life of the company, and the story of search, like any story about npm, is a story about the JavaScript community’s terrifyingly ridiculous growth:

At each of the steps along the way, we’ve had to make significant changes to our search algorithm, to support the growing ecosystem.

chapter 1, 2010–2012: the early years

When the registry was just getting started, guessing a few keywords was a great way to find the module you were looking for, e.g., “http request”, “xml parser”, “node globber”.

search in the CLI

npm’s first search implementation was exclusively for the CLI. This first search implementation quite simply:

  1. walked all packages, both those in the registry and installed locally;
  2. returned any packages with description, keywords, name, etc., matching the arguments provided to npm ls [some key words].

That’s all there was to it: no stop-word removal, no stemming, no fancy-pants search-engine technology.
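To make that concrete, here is a minimal sketch of that kind of plain substring matching. It is not npm's original code: the `PackageInfo` shape, the `naiveSearch` name, and the choice to require every search term to match are assumptions made for illustration.

```typescript
// Hypothetical shape of a package record; the field names are assumptions.
interface PackageInfo {
  name: string;
  description?: string;
  keywords?: string[];
}

// Keep every package whose name, description, or keywords contain all of the
// search terms as plain, case-insensitive substrings.
// No stop-word removal, no stemming, no ranking.
function naiveSearch(packages: PackageInfo[], terms: string[]): PackageInfo[] {
  const needles = terms.map((t) => t.toLowerCase());
  return packages.filter((pkg) => {
    const haystack = [pkg.name, pkg.description || ""]
      .concat(pkg.keywords || [])
      .join(" ")
      .toLowerCase();
    return needles.every((needle) => haystack.indexOf(needle) !== -1);
  });
}

// Made-up sample records, just to exercise the function.
const sampleRegistry: PackageInfo[] = [
  { name: "request", description: "Simplified HTTP request client", keywords: ["http"] },
  { name: "xml2js", description: "Simple XML to JavaScript object converter", keywords: ["xml", "parser"] },
  { name: "glob", description: "Match files using shell-style patterns" },
];

const httpHits = naiveSearch(sampleRegistry, ["http", "request"]);
```

A scan like this touches every record on every query and returns matches in registry order, which is fine for a few hundred packages and not much beyond that.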

With only a few hundred packages in the registry, this worked great … for a while.

search.npmjs.org

In December 2010, just a few months after we released search for the CLI, Mikeal Rogers implemented search.npmjs.org.

Mikeal’s code introduced several improvements over the initial search implementation:

search.npmjs.org was definitely a step forward for search; it also set in motion the npm website’s search drifting away from the npm CLI’s search… something that’s taken us until now to correct.

chapter 2, 2012–2014: the growth of the Node.js ecosystem

At a few hundred packages in the registry, the approach to search described above worked great, but as the ecosystem grew and users adopted the tiny-module approach to development, search began to fail:

To help address this growing discoverability problem, several implementations of search grew out of the community. These third-party search sites introduced many cool innovations:

In 2014, npmjs.com adopted the indexer used by npmsearch.com. This significantly sped up search results, while also improving the discovery algorithm by ranking based on download counts.
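Ranking by download counts can be pictured as a sort applied to whatever matched the query. The sketch below is only an illustration of that idea, not the npmsearch.com indexer: the `SearchHit` shape, the `downloadsLastMonth` field, and the sample numbers are all made up.

```typescript
// Hypothetical search hit; `downloadsLastMonth` is an assumed field name.
interface SearchHit {
  pkg: string;
  downloadsLastMonth: number;
}

// Order matches so the most-downloaded package comes first.
function rankByDownloads(matches: SearchHit[]): SearchHit[] {
  // Copy before sorting so the caller's array is left untouched.
  return matches.slice().sort((a, b) => b.downloadsLastMonth - a.downloadsLastMonth);
}

// Made-up sample data.
const rankedHits = rankByDownloads([
  { pkg: "pkg-a", downloadsLastMonth: 120 },
  { pkg: "pkg-b", downloadsLastMonth: 900000 },
  { pkg: "pkg-c", downloadsLastMonth: 4500 },
]);
```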

This was a major improvement to the search algorithm, and a step in the right direction, but…

chapter 3, 2014–2017: npm, Inc., explosive growth, npms.io

When npm, Inc. formed in 2014, our first goal as a company was to make the registry a stable platform that people took for granted. As we stabilized the registry, this plan paid off. More ecosystems began calling the registry home: jQuery, React, and Meteor, to name a few. Between 2014 and early 2017, this helped the number of modules in the registry climb to over 400,000! … but our search algorithm did not age well:

Chris Zubak-Skees on Twitter: NPM search sucks so much it pulled the moon into a closer orbit.

As we researched the other search engines people used in the community, it became obvious that people were impressed by the quality of the results returned by npms.io:

Jeroen Engels on Twitter: @SamVerschueren @npmjs First entry of npms.io though. Looks like that one is becoming my first choice to search npm.

This set in motion a conversation with the folks behind npms.io, and culminated in our deciding to deploy npms.io as npm’s third-generation search.

npms.io is by far the most advanced npm search algorithm npm has ever offered. npms.io’s analyzer takes into account three categories of information in its ranking:

By ranking results based on this variety of qualities, the algorithm can surface modules that in the past might have been ignored. express is the top hit for “web framework”, for example, despite not having “web” or “framework” in its name.
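One way to picture ranking on several qualities at once is a weighted sum of per-category scores. The sketch below assumes the three categories are quality, popularity, and maintenance, each normalized to the range 0 to 1. The weights and the sample scores are illustrative assumptions, not values taken from this post, and the real analyzer does considerably more than this.

```typescript
// Per-category scores, each assumed to be normalized to 0..1.
interface ScoreDetail {
  quality: number;
  popularity: number;
  maintenance: number;
}

// Illustrative weights only; they sum to 1 so the result stays in 0..1.
const CATEGORY_WEIGHTS: ScoreDetail = { quality: 0.3, popularity: 0.35, maintenance: 0.35 };

// Combine the three category scores into a single ranking score.
function finalScore(detail: ScoreDetail): number {
  return (
    detail.quality * CATEGORY_WEIGHTS.quality +
    detail.popularity * CATEGORY_WEIGHTS.popularity +
    detail.maintenance * CATEGORY_WEIGHTS.maintenance
  );
}

// Made-up candidates: a well-kept, widely used package whose name does not
// mention the query, and a neglected one whose name matches it exactly.
const establishedScore = finalScore({ quality: 0.9, popularity: 0.95, maintenance: 0.9 });
const neglectedScore = finalScore({ quality: 0.4, popularity: 0.05, maintenance: 0.1 });
```

With scores like these, the established package ranks well above the neglected one, which is how a package can surface for a query its name never mentions.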

So far, the response from the community has been wonderful, and we’re excited to continue working with and deploying the npms.io project.

chapter 4: the future, starting… now

What’s next for search at npm?

search improvements in the CLI

We think this is very exciting news. An upcoming update to the npm command-line tool makes it so the CLI hits the shiny new search endpoint. This will unify the website and CLI search experience for the first time since 2010. It will also make default npm search on the main registry blazing fast:

The PR is basically ready, with only a handful of remaining to-dos. Check it out.
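For a rough idea of what hitting a search endpoint involves on the client side, here is a small sketch that builds a query URL. The post does not spell out the route or its parameters, so treat the `/-/v1/search` path and the `text`, `size`, and `from` parameter names as assumptions. `buildSearchUrl` is a hypothetical helper, not part of the npm CLI.

```typescript
// Assumption: the registry's search route is /-/v1/search and accepts
// `text`, `size`, and `from` query parameters.
const SEARCH_BASE = "https://registry.npmjs.org/-/v1/search";

// Build a search URL from the words a user typed after `npm search`.
// `size` is the page size and `offset` is how many results to skip.
function buildSearchUrl(terms: string[], size: number = 20, offset: number = 0): string {
  const text = encodeURIComponent(terms.join(" "));
  return SEARCH_BASE + "?text=" + text + "&size=" + size + "&from=" + offset;
}

const searchUrl = buildSearchUrl(["web", "framework"], 5);
```

Because the server does the matching and ranking, the client only has to send the query and print what comes back, instead of walking a local copy of the package index.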

you!

As mentioned, npms.io is an open-source project. We hope that the JavaScript community will pitch in to continue to make our search algorithm top-notch.

Where’s feature x? What took so long? How will search work when we reach a million packages? These are good questions, and you can help with the answers. Please, join the discussion, and help make search even more amazing.