This repository has been archived by the owner on Aug 23, 2018. It is now read-only.

API to list package dependents #231

Open
sholladay opened this issue Oct 31, 2017 · 8 comments


@sholladay

sholladay commented Oct 31, 2017

I am the author of squatter, a library that aims to determine package quality and help package authors know when a name on npm is really being used.

One metric I use in squatter to determine if a package is useful is to count the number of dependents a package has. See the implementation here: https://github.com/sholladay/squatter/blob/d1352745d28c5ba76d965cbff9ebe2769f4388b6/lib/has-binary-or-dependent.js#L16-L27

I currently use this undocumented CouchDB view:

https://registry.npmjs.org/-/_view/dependedUpon

I found the dependedUpon view here:
https://github.com/chrisdickinson/npm-get-dependents/blob/3e5a82e6039ddb3a638fa0301f356b39bab898d7/index.js#L40-L47

My understanding is that the registry team officially supports these CouchDB views but would prefer for people to move away from using them directly in this manner. So this is a feature request to provide an API replacement.

Specifically, an API that, given a package name, will return a list of its dependents (packages that depend upon it). Or at least the number of dependents, if not their names. The names would be useful so that I can filter out packages I consider bogus, but a simple count is better than nothing.

@Haroenv
Contributor

Haroenv commented Jan 16, 2018

Just leaving another comment here: the dependedUpon view is giving the same number for every package (1703502 currently).

We use:

      got(`https://replicate.npmjs.com/registry/_design/app/_view/dependedUpon`, {
        json: true,
        query: {
          startKey: JSON.stringify([name]),
          endKey: JSON.stringify([name, {}]),
          stale: 'update_after',
        },
      })

@sholladay
Author

@Haroenv from what I remember, group_level, skip, and limit are important for getting the data correctly. See my links in the original post. I got weird results like yours when I didn't include exactly the right properties.

@Haroenv
Contributor

Haroenv commented Jan 16, 2018

The issue I was having is that startkey and endkey are all lowercase now, whereas they used to be camel-cased. However, the number is still not in sync with the website.
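For anyone landing here later, this is a minimal sketch of how the parameters discussed above could be combined into one query URL. It uses the lowercase startkey and endkey names together with group_level, skip, and limit, all of which are standard CouchDB view query parameters. The helper names (buildDependentsUrl, dependentNames) are hypothetical, the default values are guesses, and the assumed row shape (keys of the form [name, dependent]) is an assumption about the view rather than something confirmed in this thread.

```javascript
// Hypothetical helpers, not part of any npm API.
const VIEW_URL = 'https://replicate.npmjs.com/registry/_design/app/_view/dependedUpon';

// Build the view URL for one package. Note the lowercase startkey/endkey.
function buildDependentsUrl(name, { groupLevel = 2, skip = 0, limit = 1000 } = {}) {
  const params = new URLSearchParams({
    startkey: JSON.stringify([name]),
    endkey: JSON.stringify([name, {}]),
    group_level: String(groupLevel),
    skip: String(skip),
    limit: String(limit),
    stale: 'update_after',
  });
  return `${VIEW_URL}?${params}`;
}

// Assumption: each row looks like { key: [name, dependent], value: 1 }.
function dependentNames(body) {
  return body.rows.map((row) => row.key[1]);
}
```

The URL can then be passed to got (or any HTTP client) and the parsed JSON body handed to dependentNames. As noted above, the resulting count may still differ from what the website shows.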

@zeke

zeke commented Jan 16, 2018

Here are two efforts to collect this data into an easily consumable offline package format:

https://github.com/nice-registry/dependent-counts
https://github.com/nice-registry/dependent-packages

They're out of date, but if the need is there I can dust them off and apply some automation to keep them fresh.

@Haroenv
Contributor

Haroenv commented Jan 16, 2018

That might be an option too. I wonder whether it's faster to download all dependents for the whole registry or to do an API call for each package (my guess is the former). I'll try out what the time (and install time) differences are, @zeke. What process do you have for keeping it updated?

@zeke

zeke commented Jan 17, 2018

What process do you have for keeping it updated?

@Haroenv I use a Heroku Scheduler process (think cron) that runs every day (or every hour, depending on the project). The process has GitHub and npm credentials. It git clones the repo, runs a build script, runs tests, and if everything passes it checks in the changes and publishes to npm.

The process is outlined here: http://zeke.sikelianos.com/npm-and-github-automation-with-heroku/
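Roughly, the steps described above could be sketched like this. It is only an illustration under assumptions: the function names (buildSteps, runSteps), the repository URL you pass in, the npm script names, and the commit message are all placeholders, and the real setup in the linked post may differ.

```javascript
const { execSync } = require('child_process');

// Hypothetical step list mirroring the description: clone, build, test,
// check in the changes, publish. Script names are placeholders.
function buildSteps(repoUrl, dir) {
  return [
    { cmd: `git clone ${repoUrl} ${dir}` },
    { cmd: 'npm install', cwd: dir },
    { cmd: 'npm run build', cwd: dir },
    { cmd: 'npm test', cwd: dir },
    // git commit exits non-zero when nothing changed, which also stops the run.
    { cmd: 'git commit -am "Update dataset"', cwd: dir },
    { cmd: 'git push', cwd: dir },
    { cmd: 'npm publish', cwd: dir },
  ];
}

// Run steps in order. execSync throws on a non-zero exit code, so a failing
// build or test prevents the later commit and publish steps from running.
function runSteps(steps, run = (step) => execSync(step.cmd, { cwd: step.cwd, stdio: 'inherit' })) {
  const done = [];
  for (const step of steps) {
    run(step);
    done.push(step.cmd);
  }
  return done;
}
```

A scheduler would then call something like runSteps(buildSteps(repoUrl, 'work')) once a day or once an hour. A real publish would also need a version bump and credentials, which this sketch leaves out.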

As I recall, collecting all the package dependents was quite a time-consuming process, and there are undoubtedly better ways to do the actual collection. But having a reasonably up-to-date dataset that people can simply npm install is really nice as a consumer of the data.

@amio

amio commented Aug 6, 2018

While waiting for the npm registry to support this officially, you may use Badgen's API:
https://api.badgen.net/npm/dependents/chalk

Or if you are looking for a badge, that's what Badgen does 🤗

WARNING

As @Haroenv 👇 points out (thanks!), there is currently a scraper behind this API (why), so it might break occasionally if npmjs.com updates its HTML structure. So don't rely on it too heavily.

We currently don't have a fast API for dependent counts; if we did, this issue wouldn't exist. Someone has to do the dirty work. If you're going to do it the same way I did, this API could be a handy option.

@Haroenv
Contributor

Haroenv commented Aug 6, 2018

For future reference, it seems that what Badgen is doing is requesting the page on npmjs.com and then running cheerio on it (code). This is useful, but I'd still love to have a real solution with an API.
