Suggestion: Provide standards around integrity between source code and published package #77

shaunwarman · 2018-12-10T18:45:36Z

Right now, it seems that most maintainers may publish their packages from their local environment. There should be a way to verify what is published against the public source code or specific git sha to maintain transparency of what is being published. Not only will this mitigate out of sync issues or accidents, but will provide greater confidence that additions aren't added as they are published (potentially malicious).

Not sure if this is the best place for this, but after reading through other issues and recent resources I thought I better put this down somewhere. And it brings up the discussion of maintainers permissions to not only package registry, but SCM as well.

ljharb · 2018-12-10T19:42:38Z

npm doesn’t require a repo, and any package with a prepublish build process (read: every Babel user) will correctly have the published package not matching any sha in their git repo.

I’m not sure how we’d be able to do any sort of verification in a consistent and automated way.

mcollina · 2018-12-11T09:10:26Z

The only way to guarantee this is to compute an hash of all the content of a module (in prepublish) and cryptographically sign it. However, there is no easy way to know:

a. what are the keys that are allowed to sign a given package
b. associate those keys with npm profiles

People have been asking for a similar feature to npm for the last 2-3 years.

shaunwarman · 2019-01-08T23:33:06Z

Thanks @ljharb @mcollina and sorry for the late reply.

This article by @skonves gave me a nice reminder of this topic. And it seems like there are tools in his tbv and another in npm-verified that attempt to do similar.

There would need to be a blessed, yet optional process to add some sort of "trusted": true || false or "verified": true || false metadata that npm or other package managers would need to set. Optional because of the common cases of transpiled code (e.g. babel, etc) and because this is an expensive task with trade offs.

sompylasar · 2019-01-09T05:56:31Z

My 1 cent out of 2 (don't have time for more): npm doesn't require a repo, but a package verification service on top of npm might require whatever is needed, and package authors that want to conform and get the "verified" badge will try and make sure their package is good. For example there are similar efforts to standardize open source repositories: https://github.com/todogroup/repolinter — there could be tools that help package authors to manage packages efficiently (I personally miss the kind of tool that would automatically set up CI and releases for me reproducibly and reliably).

ljharb · 2019-01-09T16:30:21Z

Indeed, that’s something npm can solve - but i don’t think it’s something node, and thus we, can.

niftylettuce · 2019-02-13T20:31:48Z

Chiming in here, we really need something like this... koa-router was just transferred to a relatively unknown user on GitHub and the package name was apparently sold

ZijianHe/koa-router@bd780c9#diff-04c6e90faac2675aa89e2176d2eec7d8R3

Screenshot (in case commit is force deleted):

I've version locked koa-router, which is downloaded 135K+ times per week, and subscribed on https://libraries.io/npm/koa-router to get a notification when a new version is published of this package.

HN: https://news.ycombinator.com/item?id=19156707

ljharb · 2019-02-13T20:46:05Z

@niftylettuce i'd suggest reporting that to npm; i doubt their TOS permits the sale of a package name.

niftylettuce · 2019-02-13T20:47:35Z

I did report that to NPM, and here was there response...

I also reached out to the author of the package koa-router and received a very negative response which I don't wish to share publicly out of respect since I see no malicious version of the package published yet.

justinmchase · 2019-02-13T22:17:06Z

@niftylettuce You did the right thing here even if nothing negative comes of it. Its a red flag for sure and others deserve to be alerted by this information. Good job on being alert.

niftylettuce · 2019-02-13T22:24:06Z

To anyone reading this, I'm building a tool to automate this nonsense, at least until Node/NPM do something about it. Email me at niftylettuce@gmail.com if you want to get notified once it's up. I'll notify everyone that posted in this thread and/or left reactions as well. It will be free and open-source.

Enrico204 · 2019-02-13T22:31:08Z

Maybe it's time to think to an NPM alternative.

iarna · 2019-02-14T01:55:55Z

What's more, even if npm validated against the git repo at publish time, the user can just force push after publication. Checking this kind of thing does nothing.

If you want to validate at install time, well, I hope you enjoy your multi-hour install times. =p (Seriously: In modern npm or yarn, install time per package is under 10ms—adding a git clone to that mix would massively increase overall run time.)

This kind of action gets you no security whatsoever. It solves no actual problem. It validates nothing.

freewil · 2019-02-14T02:02:33Z

@iarna what do you think about a new command for npm, I've been thinking about a npm diff that you would be able to use while upgrading a package to see actual changes. Haven't thought about it super deep, but npm update or npm install mypkg@latest followed by a npm diff or similar.

This would effectively allow you to review the code of new/updated packages, similar to how adding node_modules to version control would allow for deeper code reviews.

jaredhirsch · 2019-02-14T02:04:54Z

@niftylettuce Hey, if you're going to work on a solution, it would be way better to start a repo on github, so people can contribute ideas/feedback via issues

iarna · 2019-02-14T02:06:00Z

@freewil That seems much more useful, but I am concerned how you scale that out to the thousands of deps in a typical modern deployment. 'cause yeah, the one thing you asked for may have a reasonable diff, but what about the dozen transitive deps that also updated? Still, I think this would be an excellent place to begin experimenting with, to see how it feels (--dry-run --json can get you what an action would have done, in machine readable format).

niftylettuce · 2019-02-14T02:07:27Z

@6a68 yes it will be on GitHub, I will post the link here once I have a proof of concept up

niftylettuce · 2019-02-14T02:40:05Z

To anyone reading this - please do not harass, email, or contact the original maintainer of the package mentioned in the above discussion. It was never my intention for anyone to harass them. I simply wanted to raise awareness about this issue with NPM and the potential of this becoming a security issue in general. This is not the only package like this.

niftylettuce · 2019-02-14T03:15:40Z

Another update on the koa-router issue for anyone subscribed to this thread. The new maintainer @ZijianHe has provided us with an update ZijianHe/koa-router#494 (comment). Hopefully this eases concern and it looks like we have a new contributor in open source land.

For the CLI npm diff command, would it just accept two different versions of a package? npm diff <package> <version-a> <version-b> and diff compare tarball?

See nodejs/package-maintenance#77 (comment) Might be that nothing malicious is done with the package after a transfer but let's rather be safe than sorry.

Enrico204 · 2019-02-14T11:38:08Z

Maybe the best "way" to handle the change of a maintainer is that the new maintainer should open its own repository on NPM. They should not allow a "takeover" procedure: instead, the installation may fails with a message like "hi, this library is not maintained anymore by olddev, there is a new version in newdev". At least the developer knows that there is something going on..

mcollina · 2019-02-14T11:42:27Z

I think the current way npm does this is by forcing a bump in a major release. That's enough to protect users form a malicious "takeover" in the case of a non-responsive maintainer.

panva · 2019-02-14T11:44:19Z

I think the current way npm does this is by forcing a bump in a major release. That's enough to protect users form a malicious "takeover" in the case of a non-responsive maintainer.

Is that so? Any way to confirm?

mcollina · 2019-02-14T11:48:27Z

I've done this several time, and that's how it works. I looked in https://www.npmjs.com/policies/disputes but there is no mention about that.

ljharb · 2019-02-14T17:02:37Z

They certainly don’t; and can’t, because a legitimate new maintainer should be able to backport fixes as needed anyways.

Any owner can always and forever publish to any previously unused version number, and that’s how it must stay.

sompylasar · 2019-02-14T17:19:38Z

@iarna

What's more, even if npm validated against the git repo at publish time, the user can just force push after publication. Checking this kind of thing does nothing.

I assume you assume the evil user force pushes to remove malicious code?

If such force push leads to file changes, git commit ids will change. Npm would record the commit id it verified against, and a standalone checksum of published files. If there's no such commit in the repo, it's a red flag. If running the same checksum at the same commit id files results in a different checksum, it's a red flag.

I believe this verification has to be a background task in npm; if one of the red flags has triggered, this package downloads are paused to prevent further spread until they are resolved. There has to be cache invalidation mechanism that would notify downstream caches to not use the flagged package.

I'm in no way a security expert though, just brainstorming. What do the npm security people that npm aquihired say?

sompylasar · 2019-02-14T17:32:03Z

The way I experimented in npm-verified POC is I needed a standardized package build+publish process (I used npm prepare for that). The packages that conform to the process can be marked as verified if the checks pass. The packages that do not — never. Like in Chrome, the sites that do not use HTTPS are marked as "Not Secure" by default, and even more, colored in red if they collect some user input via a form (for npm, this can be translated to using some OS APIs like process, network, and filesystem).

freewil · 2019-02-14T18:44:35Z

@niftylettuce

For the CLI npm diff command, would it just accept two different versions of a package? npm diff <package> <version-a> <version-b> and diff compare tarball?

Yeah, I think that would be one version/use-case of the command. The way I'm currently thinking, it'd be nice to also have a no argument version similar to git diff where the current unstaged/uncommitted changes are shown, but that may be more complex than necessary for an initial version as it might require git/version control integration. I'd be hesitant in entangling npm with version control-specific code, but I think there is already a precedent with some commands - one that I'm aware of is npm version, which creates git tags for you.

freewil · 2019-02-14T18:48:28Z

@freewil

but that may be more complex than necessary for an initial version as it might require git/version control integration.

One alternative to prevent the need for tracking changes/state would be just to add a --diff flag to npm install that would output a diff. That should be much easier since npm install already mutates package.json (which means it needs to/could be aware of the version change while running) and prevents needing to track changes/state across two commands, as I originally proposed (npm install followed by npm diff).

mcollina · 2019-02-15T22:36:53Z

lockfiles are a very good idea for applications, which is the high majority of users. However, they are not really the target of this initiative.

dominykas · 2019-02-15T22:43:42Z

@mcollina this is not about lockfiles. I don't think they make sense as part of the package or part of the repo. They do make sense as a snapshot in time to achieve reproducible builds.

Once again - they should not be published on git or npm - only kept around as a convenient metadata format.

ljharb · 2019-02-15T22:54:39Z

@dominykas package-lock is already not published on npm; it'd be fine to keep a lockfile around if it didn't constrain npm install, but the only way to do that would be to rename it and commit it to git.

That might be reasonable, however - a sort of package-release.json that you'd update in-place in git, and would effectively be npm install --package-lock-only --package-lock && mv package-{lock,release}.json right before publishing a release.

dominykas · 2019-02-15T23:25:23Z

@ljharb that is also an option, but it depends on the toolchain. I've literally spent this week setting up semantic-release, because we have locked masters and therefore can't have version commits. This also means that we need to push release meta data outside of git repo. We also have a pattern where we push shrinkwraps into dangling tags (i.e. a shrinkwrap-X.Y.Z for every vX.Y.Z). Basically anything but master (or npm, for various reasons) :)

I wish npm had a way to store this, and some other, release meta data, but not necessarily distribute it with the package.

The ownership of the koa-router package was transferred to an "unknown" user. See more on Github: nodejs/package-maintenance#77, Hacker News: https://news.ycombinator.com/item?id=19156707 and the npm blog: https://blog.npmjs.org/post/182828408610/the-security-risks-of-changing-package-owners

styfle · 2019-02-19T15:49:52Z

A possible solution to this problem (integrity between source code and published package) is to provide a "builder" service that take the source code, builds it, then publishes the output.

ZEIT Now is probably 80% of the way there because builds are immutable, every build has a unique URL, the source code is captured during the build process, and you don't need a repo to publish. But if you do have a repo, you can connect it to the build system so that pushing to master (or any branch) automatically publishes.

There's a couple changes that would need to be implemented in npm:

The first missing piece is some way to link the published npm package back to the builder source so that npm (and users) can verify integrity (probably add a field in the generated package.json).
The second missing piece is some sort of allow-list that npm uses to trust certain builders (if anyone could host their own builder service, they could circumvent this integrity).
The third piece is an optional flag that users can use to opt-in to this type of integrity check. Something like npm install --builder-integrity that will stop with an error if any package is missing this new builder field in the package.json or the builder field is not in the allow-list.

Bene-Graham · 2019-02-19T21:59:05Z

A possible solution to this problem (integrity between source code and published package) is to provide a "builder" service that take the source code, builds it, then publishes the output.

ZEIT Now is probably 80% of the way there because builds are immutable, every build has a unique URL, the source code is captured during the build process, and you don't need a repo to publish. But if you do have a repo, you can connect it to the build system so that pushing to master (or any branch) automatically publishes.

There's a couple changes that would need to be implemented in npm:
1. The first missing piece is some way to link the [published npm package](https://react-tsx.now.sh) back to the [builder source](https://react-tsx.now.sh/_src) so that npm (and users) can verify integrity (probably add a field in the generated `package.json`).

2. The second missing piece is some sort of _allow-list_ that npm uses to trust certain builders (if anyone could host their own builder service, they could circumvent this integrity).

3. The third piece is an optional flag that users can use to opt-in to this type of integrity check. Something like `npm install --builder-integrity` that will stop with an error if any package is missing this new builder field in the `package.json` or the builder field is not in the _allow-list_.

I was initially thinking the same thing.

But what stops someone from having a build task that inject malicious code into the "built package"?

Wouldn't a per module sandbox / permission system to protect against certain actions (Network, Disk Access, etc) help?

freewil · 2019-02-20T00:08:11Z

@styfle

A possible solution to this problem (integrity between source code and published package) is to provide a "builder" service that take the source code, builds it, then publishes the output.

I was also thinking this. Basically the idea is to have a trusted third-party do the packaging from source.

Docker Hub might be a good place to look for inspiration. With Docker, it can be an even bigger issue, as docker images are built into binary files and pushed up to a public repository. Although, Dockerfiles/images are generally much more simple.

https://blog.docker.com/2013/11/introducing-trusted-builds/

sompylasar · 2019-02-20T00:31:05Z

But what stops someone from having a build task that inject malicious code into the "built package"?

There shouldn't be any user-provided build task code while using one of the trusted builders, only a build pipeline configuration, for example, like https://github.com/pikapkg/pack does.

Wouldn't a per module sandbox / permission system to protect against certain actions (Network, Disk Access, etc) help?

Yes, that, too, may increase confidence, and complexity. That is needed, too, but is orthogonal to this particular discussion about package build-from-source integrity.

ljharb · 2019-02-20T01:19:36Z

The build process itself needs to be able to contain user-written transforms, if it's babel, or a bundler, to support common and valid use cases.

The ownership of the koa-router package was transferred to an "unknown" user. See more on Github: nodejs/package-maintenance#77, Hacker News: https://news.ycombinator.com/item?id=19156707 and the npm blog: https://blog.npmjs.org/post/182828408610/the-security-risks-of-changing-package-owners

sompylasar · 2019-02-20T05:26:55Z

The build process itself needs to be able to contain user-written transforms, if it's babel, or a bundler, to support common and valid use cases.

It doesn't if all the common and valid use cases like babel and bundler are standardized into trusted builders. I guess there should be either no way, or a hard and impractical way, to tamper with the executable content (e.g. JavaScript code) of a trusted build by providing a malicious build step.

But again, I'm not a security expert. I pinged the Node.js SecurityWG to take a look at this discussion as I think it is rather opitionated, not engineered as a document with clear sets of requirements, tradeoffs and solutions.

travi · 2019-02-20T19:00:42Z

It doesn't if all the common and valid use cases like babel and bundler are standardized into trusted builders

i think you missed a very important detail in the point you quoted:

contain user-written transforms

part of the valid use of babel is to apply custom transformations beyond those provided by babel itself or even official plugins. these are often included as devDependencies. locking down approved builders that didnt allow the inclusion of those custom transformations would eliminate a significant portion of the value of tools like this, but allowing them does allow for changes that would be difficult to ensure are not malicious.

sompylasar · 2019-02-20T22:32:11Z

part of the valid use of babel is to apply custom transformations beyond those provided by babel itself or even official plugins. these are often included as devDependencies. locking down approved builders that didnt allow the inclusion of those custom transformations would eliminate a significant portion of the value of tools like this, but allowing them does allow for changes that would be difficult to ensure are not malicious.

I'd like to see some data about how popular these are, and whether they can be standardized. They can be published as a trusted package that itself does not use them, thus building a chain of trust. I hear you, but there's always a tradeoff between security and convenience.

ljharb · 2019-02-20T22:48:37Z

@sompylasar packages are all already implicitly trusted; the attack vector i thought you're trying to address here is when a trusted package is hijacked, for which the same risks exist for any package, including ones in the build system.

sompylasar · 2019-02-20T22:54:09Z

@sompylasar packages are all already implicitly trusted; the attack vector i thought you're trying to address here is when a trusted package is hijacked, for which the same risks exist for any package, including ones in the build system.

Yeah I guess you're right. Well that's what I mentioned above. We all would benefit from a document (like an RFC) with the list of potential attack vectors, and how they are addressed or can be.

skonves · 2019-02-20T23:44:50Z

I have actually already put together a draft of a graph of the NPM ecosystem for just that purpose. Each edge and vertex definitely has its own unique set of challenges. For the past few evenings, I have been working on a writeup, but I wouldn't mind opening it up for community feedback/contribution.

Here is the draft of the graph I have so far:

I have a draft on Medium right now, but I will move that to a more public place in a heartbeat if it means more eyes on it. Thoughts?

FWIW, my work with TBV/verifynpm and @sompylasar's similar work with npmverified seems reasonably confined to the "publishes to" edge near the top; however, I strongly agree with Ivan that understanding the full picture helps inform any proposed solution (particularly what is in and out of scope).

JamesMGreene · 2019-02-21T02:49:38Z

Wouldn't a per module sandbox / permission system to protect against certain actions (Network, Disk Access, etc) help?

I saw an interesting post about this idea from the NPM perspective not too long ago: https://hackernoon.com/npm-package-permissions-an-idea-441a02902d9b

CC @davidgilbertson

tlhunter · 2019-02-26T21:15:41Z

FWIW, we built Package Diff to gain visibility into npm package updates by directly examining the package instead of looking at the potentially-misleading source code repository:
https://diff.intrinsic.com

sompylasar · 2019-02-26T22:03:51Z

FWIW, I built package diff for the purpose of comparing next prepared version with the latest published one for the review and version bump purposes https://github.com/sompylasar/zmey-gorynych/blob/9029972/bin/cli.js#L512

justinmchase · 2019-02-27T21:51:14Z

@tlhunter very nice.

I had one thought, one way to get around it. What if you were using node-pre-gyp which fetches pre-compiled binaries rather than builds them from the source in the package and someone malicious had uploaded a pre-compiled binary from something other than the source that is published, how could we verify that they match?

tlhunter · 2019-02-27T23:20:32Z

@tlhunter very nice.

I had one thought, one way to get around it. What if you were using node-pre-gyp which fetches pre-compiled binaries rather than builds them from the source in the package and someone malicious had uploaded a pre-compiled binary from something other than the source that is published, how could we verify that they match?

For the system to be perfect we'd need to throw away the concept of npm install scripts. One could perform an npm install of every package ever made, on a set schedule, and constantly diff the artifacts left behind, but even that wouldn't be perfect. Like, the fetching of pre-compiled binaries might include information about the host machine. The event-stream incident, for example, would only inject malicious code if certain criteria were met.

I'd like to see some sort of package manager service which does the following:

Distributes both a tarball AND is the git repo
1. The user wouldn't be able to generate tarballs
2. The server would guarantee correlation between git tags and tarballs
Disable all installation scripts

justinmchase · 2019-02-28T16:39:32Z

Yeah, I agree that sounds like the only way. I could imagine npm flagging releases that were published that way and you could then opt to only accept verified dependencies if you wanted.

It would be tricky for pre-compiled binary modules as they could have a large build matrix for platforms and versions but it would be enormously helpful for confidence and security.

shaunwarman · 2019-03-08T03:34:51Z

Interesting and related proposal in the golang ecosystem:
https://go.googlesource.com/proposal/+/master/design/25530-notary.md

gokaygurcan mentioned this issue Feb 13, 2019

koa-router EmakinaTR/makina-api#37

Open

javiertury mentioned this issue Feb 13, 2019

Trustworthiness of koa-router dependency koajs/joi-router#82

Closed

basti1302 added a commit to instana/nodejs that referenced this issue Feb 14, 2019

deps(test): pin koa-router for now

cb31c6d

See nodejs/package-maintenance#77 (comment) Might be that nothing malicious is done with the package after a transfer but let's rather be safe than sorry.

eysi09 mentioned this issue Feb 19, 2019

chore(node_modules): lock koa-router version garden-io/garden#549

Merged

azu mentioned this issue Feb 23, 2019

Package manager jser/report#2

Open

dominykas mentioned this issue Feb 27, 2019

Publish a lock file as part of GH releases semantic-release/github#162

Closed

vweevers mentioned this issue Jun 17, 2019

Support file #220

Merged

Eomm added discussion The decision process is still ongoing vendor Some vendor need to be onboard labels Sep 13, 2019

Suggestion: Provide standards around integrity between source code and published package #77

Suggestion: Provide standards around integrity between source code and published package #77

Comments

shaunwarman commented Dec 10, 2018 • edited

ljharb commented Dec 10, 2018

mcollina commented Dec 11, 2018

shaunwarman commented Jan 8, 2019

sompylasar commented Jan 9, 2019

ljharb commented Jan 9, 2019

niftylettuce commented Feb 13, 2019 • edited

ljharb commented Feb 13, 2019

niftylettuce commented Feb 13, 2019 • edited

justinmchase commented Feb 13, 2019

niftylettuce commented Feb 13, 2019 • edited

Enrico204 commented Feb 13, 2019

iarna commented Feb 14, 2019 • edited

freewil commented Feb 14, 2019

jaredhirsch commented Feb 14, 2019

iarna commented Feb 14, 2019

niftylettuce commented Feb 14, 2019

niftylettuce commented Feb 14, 2019

niftylettuce commented Feb 14, 2019

Enrico204 commented Feb 14, 2019

mcollina commented Feb 14, 2019

panva commented Feb 14, 2019

mcollina commented Feb 14, 2019

ljharb commented Feb 14, 2019 • edited

sompylasar commented Feb 14, 2019

sompylasar commented Feb 14, 2019

freewil commented Feb 14, 2019

freewil commented Feb 14, 2019

mcollina commented Feb 15, 2019

dominykas commented Feb 15, 2019

ljharb commented Feb 15, 2019

dominykas commented Feb 15, 2019 • edited

styfle commented Feb 19, 2019

Bene-Graham commented Feb 19, 2019

freewil commented Feb 20, 2019 • edited

sompylasar commented Feb 20, 2019

ljharb commented Feb 20, 2019

sompylasar commented Feb 20, 2019

travi commented Feb 20, 2019 • edited

sompylasar commented Feb 20, 2019

ljharb commented Feb 20, 2019

sompylasar commented Feb 20, 2019

skonves commented Feb 20, 2019

JamesMGreene commented Feb 21, 2019 • edited

tlhunter commented Feb 26, 2019

sompylasar commented Feb 26, 2019

justinmchase commented Feb 27, 2019

tlhunter commented Feb 27, 2019

justinmchase commented Feb 28, 2019

shaunwarman commented Mar 8, 2019

shaunwarman commented Dec 10, 2018 •

edited

niftylettuce commented Feb 13, 2019 •

edited

niftylettuce commented Feb 13, 2019 •

edited

niftylettuce commented Feb 13, 2019 •

edited

iarna commented Feb 14, 2019 •

edited

ljharb commented Feb 14, 2019 •

edited

dominykas commented Feb 15, 2019 •

edited

freewil commented Feb 20, 2019 •

edited

travi commented Feb 20, 2019 •

edited

JamesMGreene commented Feb 21, 2019 •

edited