Remove arbitrary limits on OP_Return (datacarrier) outputs by petertodd · Pull Request #32359 · bitcoin/bitcoin · GitHub

Remove arbitrary limits on OP_Return (datacarrier) outputs #32359

Closed
petertodd wants to merge 7 commits

Contributor
@petertodd petertodd commented Apr 27, 2025

As per recent bitcoindev mailing list discussion.

Also removes the code to enforce those limits, including the -datacarrier and -datacarriersize config options.

These limits are easily bypassed by both direct submission to miner mempools (e.g. MARA Slipstream), and forks of Bitcoin Core that do not enforce them (e.g. Libre Relay). Secondly, protocols are bypassing them by simply publishing data in other ways, such as unspendable outputs and scriptsigs.

The form of datacarrier outputs remains standardized: a single OP_Return followed by zero or more data pushes; non-data opcodes remain non-standard.
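
The standardized form described above can be illustrated with a minimal sketch. It assumes the usual script encoding (direct pushes, `OP_PUSHDATA1/2/4`, and the small-number opcodes up to `OP_16` counted as pushes) and is only loosely modeled on how a push-only check works; the function name `is_datacarrier_form` is hypothetical and this is not Bitcoin Core's actual code.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA4 = 0x4E
OP_16 = 0x60  # assumption: opcodes up to OP_16 are treated as pushes


def is_datacarrier_form(script: bytes) -> bool:
    """Return True if script is OP_RETURN followed by zero or more pushes."""
    if len(script) < 1 or script[0] != OP_RETURN:
        return False
    i = 1
    while i < len(script):
        opcode = script[i]
        i += 1
        if opcode > OP_16:
            return False  # non-data opcode: not the standardized form
        if opcode < OP_PUSHDATA1:
            length = opcode  # direct push of 0..75 bytes
        elif opcode <= OP_PUSHDATA4:
            width = 1 << (opcode - OP_PUSHDATA1)  # 1, 2 or 4 length bytes
            if i + width > len(script):
                return False  # truncated length field
            length = int.from_bytes(script[i:i + width], "little")
            i += width
        else:
            length = 0  # OP_1NEGATE, OP_RESERVED, OP_1..OP_16 carry no payload
        if i + length > len(script):
            return False  # truncated push data
        i += length
    return True


# OP_RETURN followed by a 2-byte push is accepted; OP_RETURN OP_DUP is not.
ok = is_datacarrier_form(bytes([0x6A, 0x02, 0xAA, 0xBB]))
bad = is_datacarrier_form(bytes([0x6A, 0x76]))
```

Note that nothing in this check looks at the total size of the script, which matches the idea that only the form, not the size, stays standardized.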

CC: sipa darosior

@DrahtBot
Contributor
DrahtBot commented Apr 27, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32359.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept NACK luke-jr, BitcoinMechanic, Retropex, 1ma, nsvrn, jesterhodl, chrisguida, wizkid057, Seccour, Fiach-Dubh, rstmsn, Turtlecute33, danielabrozzoni, shahsb, theDavidCoen, bkkarki21, moth-oss, donaldevine, TheGuySwann, crownbtc, ctrlbreak-, matthew-ellis, billylindeman, sonoranai, LaurentMT, BTCMcBoatface, jackedproxy, leCheeseRoyale, mrberlinorg, francisco-alonso, cezar1, i5hi, Kakar21, Ali2kCom, ake-khada, smbpunt, Specter2100, adamdecaf, Tech1k, bubelov, murrayn, alfredopalhares, odarboe, iFadi, dominickbrasileiro, samurai321, davidhrinaldo, keith-gardner, captCovalent, walkjivefly, va7wv, pithosian, liviu-liviu, juanitoddd, donaldevinev1, Hackzero00
Concept ACK Rob1Ham, jlopp, polespinasa, Christewart, torkelrogstad, michael1011, jamesob, eragmus, glozow, Jeremy-coding, Psifour, hsjoberg, jaonoctus, murchandamus, cbspears, owenstrevor, fjahr, owenkemeys, sbddesign, SergioDemianLerner, nud3l, Haaroon, snakezhu, scgbckbone, stevenroose, Zodomo, carnhofdaki, aryaethn
Approach NACK 1440000bytes, Cyberwiz9000
Stale ACK reardencode, darosior, Sjors, instagibbs, RandyMcMillan
Ignored review miketwenty1, albertoig, monlovesmango

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #32406 (policy: uncap datacarrier by default by instagibbs)
  • #32133 (RFC: Accept non-std transactions in Testnet4 by default again by fjahr)
  • #29954 (RPC: Return permitbaremultisig and maxdatacarriersize in getmempoolinfo by kristapsk)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot
Contributor

🚧 At least one of the CI tasks failed.
Debug: previous releases, depends DEBUG https://github.com/bitcoin/bitcoin/runs/41237737205
LLM reason (✨ experimental): (empty)

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

Member
@luke-jr luke-jr left a comment

As per ML discussion, firm Concept NACK.

Member
@darosior darosior left a comment

Concept ACK.

@BitcoinMechanic

Concept NACK. Hard to take this seriously. Those config options not having direct impact over what miners may put in blocks does not equate to users no longer having a choice over what ends up in their mempools.

@Retropex

Concept NACK. If miners want larger datacarrier transactions, they can use these settings to do so.

There’s no reason to prevent miners and node runners from making this choice.

@1ma
1ma commented Apr 27, 2025

Again? #28130

Concept NACK, for the same reasons already discussed 2 years ago.

@petertodd
Contributor Author

@1ma Please read the relevant mailing list discussion first: https://groups.google.com/g/bitcoindev/c/d6ZO7gXGYbQ

There are good reasons why this is being brought up again.

@1ma
1ma commented Apr 27, 2025

Already did. I'm actually waiting for someone to add a new message so I can add mine with the data I gathered (just joined the mailing list).

If you care to actually look at the blockchain, it turns out no one is trying to get large OP_RETURNs mined: https://github.com/1ma/blockstats

@petertodd
Contributor Author

@1ma As was discussed in the mailing list discussion, entities are using unspendable outputs in lieu of OP_Return outputs, precisely because of the size limit. This increases the UTXO set size unnecessarily, a harmful effect of having the arbitrary OP_Return output limitations. Your analysis does not take that kind of issue into account.

Rather than do a stream of incremental increases, wasting dev bandwidth, this pull-req simply removes those arbitrary limits.

Anyway, this kind of non-technical discussion seems off-topic for this pull-req. Best to do it on the mailing list. And furthermore, people have choice - Knots exists. If they wish to enforce these limits on their own nodes they're welcome to run Knots. There's no reason why Bitcoin Core should be forced to take on the maintenance burden of maintaining arbitrary limits that we believe are ineffective, and even harmful.

@Rob1Ham
Rob1Ham commented Apr 27, 2025

Concept ACK. Better to have provable unspendable outputs than dust outputs forever in the utxo set.
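
The UTXO-set point can be made concrete with a toy model. This is a sketch under stated assumptions: a node skips outputs it can prove unspendable (here, scripts starting with `OP_RETURN` or exceeding a maximum script size) when updating its UTXO set, while data disguised as an ordinary-looking output has to be stored. The names `is_provably_unspendable` and `add_outputs` are hypothetical, as are the txids and payloads.

```python
from typing import Dict, List, Tuple

OP_RETURN = 0x6A
MAX_SCRIPT_SIZE = 10_000  # assumption: scripts larger than this can never be spent


def is_provably_unspendable(script_pubkey: bytes) -> bool:
    """True if no input could ever spend this output."""
    starts_with_op_return = len(script_pubkey) > 0 and script_pubkey[0] == OP_RETURN
    return starts_with_op_return or len(script_pubkey) > MAX_SCRIPT_SIZE


def add_outputs(utxo_set: Dict[Tuple[str, int], bytes],
                txid: str, outputs: List[bytes]) -> None:
    """Add a transaction's outputs to the toy UTXO set, skipping unspendable ones."""
    for index, script in enumerate(outputs):
        if is_provably_unspendable(script):
            continue  # never enters the UTXO set
        utxo_set[(txid, index)] = script


utxo_set: Dict[Tuple[str, int], bytes] = {}
payload = bytes(32)  # hypothetical 32 bytes of published data

# Data carried in an OP_RETURN output: provably unspendable, so not stored.
add_outputs(utxo_set, "tx_a", [bytes([OP_RETURN, 0x20]) + payload])

# The same data disguised as a witness-program-style output (0x00 0x20 <32 bytes>):
# indistinguishable from a real script hash, so it stays in the set indefinitely.
add_outputs(utxo_set, "tx_b", [bytes([0x00, 0x20]) + payload])
```

After both calls only the disguised output from `tx_b` occupies the toy UTXO set.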

@BitcoinMechanic

This comment was marked as off-topic.

@1ma
1ma commented Apr 27, 2025

@petertodd @Rob1Ham Sounds like a new standardness rule is called for, not removing existing ones.

@reardencode
reardencode commented Apr 27, 2025

utACK cd7872c

@DrahtBot DrahtBot requested a review from darosior April 27, 2025 22:17
@Retropex

@petertodd, if you genuinely want to promote OP_RETURN for the network’s health, this PR should include a similar filter like #28408.

@petertodd
Contributor Author

@Retropex Data publishing via unspendable UTXOs is undetectable. We can't block it without significant changes to the consensus protocol.

@jlopp
Contributor
jlopp commented Apr 27, 2025

Concept ACK.

It's time to come to terms with the fact that Bitcoin is desirable for some people to use as a data anchor and they will find a way to use it as such, thus we should be asking what the preferred means of data anchoring is and how to incentivize it over less preferable options.

@polespinasa
Contributor

I could cACK on removing those limits by default, but:

Also removes the code to enforce those limits, including the -datacarrier and -datacarriersize config options.

Concept NACK to this. The code is already there; I don't see the point of taking away users' configuration options to their mempool policies.

@luke-jr
Member
luke-jr commented Apr 27, 2025

Data publishing via unspendable UTXOs is undetectable. We can't block it without significant changes to the consensus protocol.

This is false. It would require invasive changes to the address format (and disabling old address formats), but there are no consensus changes required.

@petertodd
Contributor Author

This is false. It would require invasive changes to the address format (and disabling old address formats), but there are no consensus changes required.

You are referring to standardness rules, which can be easily bypassed. Consensus changes are required to have any hope of actually preventing the publication of data.

Besides, deprecating old address formats to "fight spam" --- an enormously costly ecosystem wide change --- is clearly unlikely to happen, not least because it would be ineffective. Similarly, the consensus changes that could in theory prevent most (but not all) data publication are also clearly unlikely to happen.

@petertodd
Contributor Author

@polespinasa

I don't see the point of taking away users' configuration options to their mempool policies.

We did exactly that with Full-RBF, because the policies of individual nodes are unable to prevent profitable transactions from being broadcast and mined. Full-RBF reached ~100% miner adoption before Bitcoin Core even released a version with it enabled by default.

@nsvrn
Contributor
nsvrn commented Apr 27, 2025

Anyway, this kind of non-technical discussion seems off-topic for this pull-req. Best to do it on the mailing list. And furthermore, people have choice - Knots exists. If they wish to enforce these limits on their own nodes they're welcome to run Knots. There's no reason why Bitcoin Core should be forced to take on the maintenance burden of maintaining arbitrary limits that we believe are ineffective, and even harmful.

Concept NACK

There are certainly users who are using these config options (including Core users and users of packaged software UIs like Umbrel). Removing useful optionality that already exists is completely bogus. What kind of maintenance burden has this been for the "we" (whoever that refers to)? It sounds like a made-up claim to achieve your end goal.

@jesterhodl
jesterhodl commented Apr 27, 2025

Removing an easy way for bitcoin users to precisely express their preference as to how their node works is like trying to centrally plan an economic system. It will end up in suboptimal results, which nobody will be happy with except non-monetary applications and related beneficiaries.

ML talk references Citrea. The PR seems to anticipate some company's mere intent. Are we now shapeshifting Bitcoin to whatever people publish they might be doing? Bitcoin has a purpose and it's not appeasement.

As a node runner I Concept NACK on both grounds.

@libertyluminary

This comment was marked as abuse.

@1440000bytes
1440000bytes commented May 9, 2025

Concept NACK

This makes no difference to PR as explained by Ava in https://x.com/achow101/status/1919467263855300733

@walkjivefly

Concept NACK.

There's already more than enough garbage in the Bitcoin chain thanks to the Ordinals Taproot hijacking. There's no need to make it even easier for people to spam the chain.

@aryaethn

Concept ACK.
Even though I am not a Bitcoin developer, I see this as a benefit for the ecosystem around Bitcoin, especially the DeFi ecosystem, like bridges, DEXs, and lending. Removing the data limit on OP_RETURN, or at least loosening the limit, provides a way for on-chain communication for DeFi dApps that try to include Bitcoin in DeFi.

@aryaethn

Concept NACK.

There's already more than enough garbage in the Bitcoin chain thanks to the Ordinals Taproot hijacking. There's no need to make it even easier for people to spam the chain.

As you know, people pay for every byte of data they post. So how is that spamming when you pay for the so-called "garbage"?

@rstmsn
rstmsn commented May 10, 2025

As you know, people pay for every byte of data they post. So how is that spamming when you pay for the so-called "garbage"?

It is reasonably well agreed upon that if there was a highly effective way of excluding things like JPEG data from being stored on the chain, it most certainly would be excluded.

Some users exploit weaknesses in the ability to parse for & successfully exclude this data.

Paying a transaction fee as part of that process does not legitimise the behaviour.

@aryaethn

It is reasonably well agreed upon that if there was a highly effective way of excluding things like JPEG data from being stored on the chain, it most certainly would be excluded.

I understand and agree that heavy data is "better" stored somewhere else, with a link or something similar. Maybe on a file storage chain, or IPFS-based blockchains. No force, just better that way.

Some users exploit weaknesses in the ability to parse for & successfully exclude this data.
Paying a transaction fee as part of that process does not legitimise the behaviour.

I know it doesn't legitimize the action, but it does not break any rules. Bitcoin's consensus is based on "Pay more fee to get included first". They pay the fee, and miners are happy. I don't see the problem.

@rstmsn
rstmsn commented May 10, 2025

I know it doesn't legitimize the action, but it does not break any rules. Bitcoin's consensus is based on "Pay more fee to get included first". They pay the fee, and miners are happy. I don't see the problem.

Exploiting the system to use it as cloud data storage pushes up the cost of running a node. The decentralisation characteristic is an emergent property of many users running and using their own nodes.
To the extent that this aspect becomes more expensive, the decentralisation characteristic will decline.
This unavoidably puts the long term health & success of the project at risk.

This debate is currently divided along the lines of those who believe we should remain proactive towards spam filtration, despite the lack of a 100% effective solution, vs those who believe the absence of a technically perfect solution means we should capitulate to spammers, and remove all efforts to slow down / reduce spam.

@pithosian
pithosian commented May 10, 2025

NACK.

I've already made comments in the mailing list. To summarize: this change is not necessary for nonstandard OP_RETURN transactions to be relayed to miners via public mempools, nor is it necessary to prevent delays in relay of compact blocks. That said, I don't have strong feelings about removing OP_RETURN configuration limits, or default configuration changes.

However, removing the datacarrier and datacarriersize options has not been well justified. Downloading a transaction twice is not the same as downloading it once and uploading it N times. Even if N is 1. Download and upload bandwidth are separate, and on many residential networks, as well as cellular, upload bandwidth is significantly lower. My upload speed is 1/30th of my download speed. Removing these flags will not make mempools more consistent. It will push (and is already pushing) more users to run alternative clients which respect their right to control what the code running on their computer does.

Running a node is mandatory to interact with Bitcoin without intermediaries. It's mandatory if you want to validate the chain yourself. Having a mempool is useful to the individual noderunner. Justifying a change to the reference implementation by pointing out that 'other clients exist', when other clients exist which already have the change you're attempting to justify is a hypocritical non-argument. There's already a minimum relay option (blocksonly), and a maximum relay option (librerelay). The only interesting work to do regarding relay policy is to give users more control over their own resources, not remove existing options for no good reason.

Changes to the reference implementation increase the maintenance burden of downstream alternatives which don't want to mirror that change, as well as the review burden of every user who wants to check that downstream's changes in the future. These flags haven't cost Core anything over the past decade. A reference implementation should not make frivolous changes, and the removal of these flags is the definition of frivolous.

All of my points apply equally to #32406, which is the same change, but delaying removal of these flags by first deprecating them, so in a few releases they can be removed with the justification being "they're deprecated".

Edit because it doesn't warrant posting another comment: please refrain from posting LLM generated spam anywhere, let alone Bitcoin development discussion channels (eg: #32359 (comment)). It's transparently not written by a human, adds absolutely nothing to the conversation, is riddled with self-contradictory statements and factual errors, and discourages good-faith dialogue. People want to hear your thoughts, not the output of predictive text with too much hardware thrown at it.

@aryaethn

This debate is currently divided along the lines of those who believe we should remain proactive towards spam filtration, despite the lack of a 100% effective solution, vs those who believe the absence of a technically perfect solution means we should capitulate to spammers, and remove all efforts to slow down / reduce spam.

I agree. It's a debate that no one wins by argument. Only the majority decides the winner.

@walkjivefly

Concept NACK.
There's already more than enough garbage in the Bitcoin chain thanks to the Ordinals Taproot hijacking. There's no need to make it even easier for people to spam the chain.

As you know, people pay for every byte of data they post. So how is that spamming when you pay for the so-called "garbage"?

The garbage transactions increase chain size, block occupancy and fees, and can make it harder and more expensive to get financial transactions onto the chain.

They also expose node operators to the dangers of "illegal" content on their machines.

True, the spammers pay for every byte they put on chain but arbitrary non-financial content is not what Bitcoin was designed for. Remember the first line of the Bitcoin White Paper:

Abstract. A purely peer-to-peer version of electronic cash would allow online
payments to be sent directly from one party to another without going through a
financial institution.

@walkjivefly

I know it doesn't legitimize the action, but it does not break any rules. Bitcoin's consensus is based on "Pay more fee to get included first". They pay the fee, and miners are happy. I don't see the problem.

Also true; miners are always going to be happy with more fees. But although they're essential to the operation of the network they represent only a very small percentage of the actual nodes in the network. They are not the only ones who need to be happy.

@aryaethn

True, the spammers pay for every byte they put on chain but arbitrary non-financial content is not what Bitcoin was designed for. Remember the first line of the Bitcoin White Paper:

Abstract. A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution.

I generally agree with your argument with some specification. The current OP_RETURN size, makes even financial (DeFi) transaction face issues. Almost all bridges have problems, meaning the whole DeFi ecosystem lacks good Bitcoin inclusion. With a lower limit or removed limit, the whole DeFi can have the benefit of Bitcoin's enormous market cap.

@1440000bytes

This pull request won't get merged. If you really have any point that is worth considering please do it in #32406

I might get banned for this comment because Bitcoin Core contributors cannot read disagreements or dissent here or on other forums.

However, I would like to read your comments in #32406 if they make sense.

@pokrovskyy

Meta-protocol builders will find ways to store data on-chain regardless of the decision here. Currently it's in witness data. So this discussion does not solve any raised questions on storing data and its legality etc. However, enabling data in OP_RETURN does offer a number of benefits. E.g.:

  • future meta-protocols may choose OP_RETURN as the storage option for their data (instead of witness)
  • this will decrease the data inflow into witness
  • consequently, this will decrease the growth rates of the UTXO set, helping with node running issues presented above
  • additionally, it's 4x more expensive, so more miner benefits

Hope this makes sense!
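
To make the 4x figure concrete, here is a minimal sketch assuming BIP 141 weight accounting, where each non-witness byte counts as 4 weight units and each witness byte as 1. The helper names and the 1000-byte payload are hypothetical, and the sketch ignores per-transaction overhead such as push opcodes, inscription-style envelopes, or the extra transaction a witness-based scheme may need.

```python
WITNESS_SCALE_FACTOR = 4  # weight units per non-witness byte under BIP 141


def weight_units(non_witness_bytes: int, witness_bytes: int) -> int:
    """Weight contributed by a given number of non-witness and witness bytes."""
    return WITNESS_SCALE_FACTOR * non_witness_bytes + witness_bytes


def virtual_size(weight: int) -> int:
    """Virtual size in vbytes: weight divided by 4, rounded up."""
    return (weight + WITNESS_SCALE_FACTOR - 1) // WITNESS_SCALE_FACTOR


payload = 1000  # hypothetical payload size in bytes

# Payload carried in an OP_RETURN output (non-witness data).
in_output = weight_units(non_witness_bytes=payload, witness_bytes=0)

# Same payload carried in witness data.
in_witness = weight_units(non_witness_bytes=0, witness_bytes=payload)

ratio = in_output / in_witness
```

At any given feerate per vbyte, the payload bytes themselves therefore cost about four times as much in an output as in the witness, before overhead is considered.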

@va7wv

This comment was marked as off-topic.

@yope
yope commented May 11, 2025

Meta-protocol builders will find ways to store data on-chain regardless of the decision here. Currently it's in witness data. So this discussion does not solve any raised questions on storing data and its legality etc. However, enabling data in OP_RETURN does offer a number of benefits. E.g.:

I agree to this. But I don't think there are only benefits...

  • future meta-protocols may choose OP_RETURN as the storage option for their data (instead of witness)

And future undesirable meta-protocols may choose to put their data in OP_RETURN when they otherwise would have stayed away entirely.

  • this will decrease the data inflow into witness

How exactly? The witness is still cheaper.

  • consequently, this will decrease the growth rates of the UTXO set, helping with node running issues presented above

There are cases to be made for users that might choose OP_RETURN to not bloat the UTXO set, but it won't stop a bad actor from doing it anyway. We also know that Citrea doesn't care enough about this to change their protocol in retrospect. Would someone like Citrea have chosen OP_RETURN if it had no limit in Bitcoin Core (it still has in other implementations)? We can only speculate. So this is speculation, just as the idea of getting more meta-protocols using Bitcoin if OP_RETURN is limitless is also speculation. Which argument weighs stronger? I think this is not something for this project to decide. If Bitcoin Core wants to remain neutral, it should remain neutral on the configurability of this limit also IMHO.

  • additionally, it's 4x more expensive, so more miner benefits

This just contradicts point 2 AFAICS. If it is more expensive than the witness space, why would any protocol choose OP_RETURN if they had a choice?

Hope this makes sense!

Halfway. I think you are leaving out important nuances and also potential negative effects of this.
One other game theoretical aspect I'd like to point out is the fact that miners can be considered mostly aligned with bitcoin's success, since bitcoin price is part of their bottom line. This is relevant because the fear of mining centralization has often been brought up. Meta-protocol developers OTOH have no incentive at all to make bitcoin successful in its monetary function. Given these facts, it is strange to on the one hand argue that meta-protocol developers in general will "play nice" by using OP_RETURN instead of less desirable or harmful alternatives and on the other hand assume mining centralization is a given outcome. It's both speculation and IMHO neutrality is to not merge this PR.

@ranathan14

This comment was marked as spam.

@Cyberwiz9000
Cyberwiz9000 commented May 11, 2025

Approach NACK

I can understand the technical reasons for lifting the OP_RETURN data limit, but I oppose the removal of node configurability. Both -datacarrier and -datacarriersize should be retained. Node operators should be able to choose what kind of data they relay without having to rely on other node implementations that have less developer support.

Edit: Also, whether it is actually virtue signalling or not, a node runner should be able to choose not to relay certain data. It is their node, and not actively aiding in the propagation of certain data, which could be considered spam or otherwise unwanted, should be within their rights.

Additionally, I object to allowing multiple OP_RETURN outputs per transaction. This may increase mempool complexity, raise attack surface, and introduce ambiguity in interpreting transactions. I suggest limiting transactions to a single `OP_RETURN`.

Edit: And in the mailing list it was suggested that initially only the data limit would be removed.

As there is substantial opposition and no urgency for these changes, the Bitcoin Core policy should be conservative by default.

Edit: Finally, even though these aren't consensus changes, if the Core maintainers can unilaterally push through these changes, without widespread support from its users (node runners), then that becomes an attack surface in itself.

@liviu-liviu

Concept NACK. We should create more (and possibly easier) avenues that help node operators express their will. This PR is moving us in the opposite direction.

@l0rinc
Contributor
l0rinc commented May 11, 2025

I think we should have an option (default on) to obfuscate the block data files on startup if they are not yet obfuscated.

For the record, @andrewtoth pushed a separate tool to fix this issue since, see: #32451


As for removing the limit, I don't think it makes any meaningful difference either way; this discussion was way overblown for some reason. My only objection was how aggressively this was pushed - we should have started with a long and patient education phase instead of suggesting urgency for a change that ultimately doesn't matter that much (we can likely come up with similar fixes for objections on both sides, like the obfuscation tool above, if needed).

@juanitoddd

Concept NACK

As a node runner, removing these configuration options limits my sovereignty over what my node relays.

@donaldevinev1

Concept NACK - removing these limits is like saying 'because some cars are speeding we should remove the speed limit on all the roads'.

@pinheadmz
Member

Concept NACK - removing these limits is like saying 'because some cars are speeding we should remove the speed limit on all the roads'.

It's more like "some cars are driving idiotically slow so we'll allow them to drive on the shoulder so they bother less people"

@Hackzero00

Concept NACK. Removing these limits formalizes surrender to protocol abuse. The fact that data spam circumvents config options doesn't justify abandoning enforcement. Bitcoin is a monetary protocol, not a general-purpose data store. Normalizing arbitrary payloads weakens decentralization, bloats the chain, and betrays the principle of node sovereignty. The base layer must remain focused, lean, and resistant to parasitic use. This change undermines that.

@glozow
Member
glozow commented May 12, 2025

Based on the discussion, there seems to be consensus around leaving the config options in place. That approach is implemented in #32406, so closing this PR. (Also: please do not jump to conclusions about whether this means the other PR will be merged/closed)
