[bitcoin-dev] [patch] Switching Bitcoin Core to sqlite db

Discussion:

Jeff Garzik via bitcoin-dev

2015-10-22 21:26:42 UTC

Here is the beginnings of an implementation to replace leveldb with sqlite:
https://github.com/jgarzik/bitcoin/tree/2015_sqlite

It builds, but still needs work before passing tests.

It was noted that leveldb is unmaintained, and this is part of researching
alternatives that are maintained and reliable.

Joseph Gleason ⑈ via bitcoin-dev

2015-10-22 21:56:18 UTC

Permalink

I have done a lot of recent work on local key value stores, mostly for a
java electrum server I am working on.

I'd suggest considering LMDB. One downside is that it is memory mapped so
32-bit systems that need over 2gb of storage are right out. Other than
that, it is quite fast and seems reliable in my testing.

On Thu, Oct 22, 2015 at 2:37 PM Jeff Garzik via bitcoin-dev <

Post by Jeff Garzik via bitcoin-dev
Here is the beginnings of an implementation to replace leveldb with
sqlite: https://github.com/jgarzik/bitcoin/tree/2015_sqlite
It builds, but still needs work before passing tests.
It was noted that leveldb is unmaintained, and this is part of researching
alternatives that are maintained and reliable.
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

Patrick Strateman via bitcoin-dev

2015-10-22 21:54:14 UTC

Permalink

Benchmarks?

I cant imagine that's very fast.

Post by Jeff Garzik via bitcoin-dev
Here is the beginnings of an implementation to replace leveldb with
sqlite: https://github.com/jgarzik/bitcoin/tree/2015_sqlite
It builds, but still needs work before passing tests.
It was noted that leveldb is unmaintained, and this is part of
researching alternatives that are maintained and reliable.
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

Jonas Schnelli via bitcoin-dev

2015-10-23 06:53:20 UTC

Permalink

Nice work!

Although not sure if we should focus directly on sqlite4 (could be
optional with a configure flag and a subtree [until stable], sqlite3
supported over depends).

Also i personally would recommend to not implement it as a
replacement, instead, support multiple backends (wrapper header /
different wrapper implementations [leveldb / sqlite3 / sqlite4]). But
– agreed – should not be the focus, but a nice additional flexibility
if it not require much more work.
And – in theory – multiple database back-ends would allow migration.

Before investigating to deep, i think we need a dbwrapper bench tool
that represents our needs and our style how we hit the database.
Gavins recently added bench target / bench environment could be used
for that.

/jonas

Lucas Betschart via bitcoin-dev

2015-10-23 07:45:09 UTC

Permalink

Facebook has a LevelDB fork which is maintained.
It's called RocksDB and the API seems to be nearly the same as for LevelDB,
thus maybe easy to replace: http://rocksdb.org/
https://github.com/facebook/rocksdb

Although I don't know if we might have some negative effects for our
use-case since RocksDB was optimized for big databases running on multiple
cores.

2015-10-22 23:26 GMT+02:00 Jeff Garzik via bitcoin-dev <

Sean Lynch via bitcoin-dev

2015-10-28 20:28:00 UTC

Permalink

On Fri, Oct 23, 2015 at 1:23 AM Lucas Betschart via bitcoin-dev <

Post by Lucas Betschart via bitcoin-dev
Facebook has a LevelDB fork which is maintained.
It's called RocksDB and the API seems to be nearly the same as for
LevelDB, thus maybe easy to replace: http://rocksdb.org/
https://github.com/facebook/rocksdb
Although I don't know if we might have some negative effects for our
use-case since RocksDB was optimized for big databases running on multiple
cores.

While RocksDB is pretty decent, note that it's optimized for flash. Not
sure how well it will work on spinning disks.

Jeff Garzik via bitcoin-dev

2015-10-28 21:11:47 UTC

Permalink

Post by Sean Lynch via bitcoin-dev
On Fri, Oct 23, 2015 at 1:23 AM Lucas Betschart via bitcoin-dev <

While RocksDB is pretty decent, note that it's optimized for flash. Not
sure how well it will work on spinning disks.

That's OK for our purposes. We have a huge database which already
incentivized having zero seek time.

Tom Zander via bitcoin-dev

2015-10-23 10:30:37 UTC

Permalink

Post by Jeff Garzik via bitcoin-dev
It was noted that leveldb is unmaintained, and this is part of researching
alternatives that are maintained and reliable.

Apart from it being unmaintained, any links to what are problems with levelDB?

Douglas Roark via bitcoin-dev

2015-10-26 18:06:56 UTC

Permalink

Post by Tom Zander via bitcoin-dev

Post by Jeff Garzik via bitcoin-dev
It was noted that leveldb is unmaintained, and this is part of researching
alternatives that are maintained and reliable.

Apart from it being unmaintained, any links to what are problems with levelDB?

While not exactly the most rigorous link,
https://en.wikipedia.org/wiki/LevelDB#Bugs_and_Reliability seems like an
okay place to start. One thing I can attest to is that, when Armory used
LevelDB (0.8 - 0.92, IIRC), quite a few users had DB corruption issues,
particularly on Windows. Even when a switch to LMDB occurred for 0.93,
loads of complaints would come in from users whose LevelDB-based Core
DBs would fail. I know that the guy who moved Armory over to LMDB would
love to have more time in the day so that he could write a Core patch
that does the same. It's a very sore spot for him.

(FWIW, LMDB seems to work quite nicely, at least once you patch up the
source a little bit. The latest version is also compatible with Core's
cross-compiling scheme. I'd love to see it added to Core one day.)

Doug

Tom Zander via bitcoin-dev

2015-10-28 15:52:53 UTC

Permalink

Post by Douglas Roark via bitcoin-dev
While not exactly the most rigorous link,
https://en.wikipedia.org/wiki/LevelDB#Bugs_and_Reliability seems like an
okay place to start.

Thanks for that link!

Another Google open source product I'll avoid like the plague ;)

Jonathan Wilkins via bitcoin-dev

2015-11-18 00:06:44 UTC

Permalink

Benchmarks for various DBs under discussion:
http://symas.com/mdb/microbench/

On Mon, Oct 26, 2015 at 11:06 AM, Douglas Roark via bitcoin-dev <

Post by Jeff Garzik via bitcoin-dev

Post by Tom Zander via bitcoin-dev

Post by Jeff Garzik via bitcoin-dev
It was noted that leveldb is unmaintained, and this is part of

researching

Post by Tom Zander via bitcoin-dev

Post by Jeff Garzik via bitcoin-dev
alternatives that are maintained and reliable.

Apart from it being unmaintained, any links to what are problems with

levelDB?
While not exactly the most rigorous link,
https://en.wikipedia.org/wiki/LevelDB#Bugs_and_Reliability seems like an
okay place to start. One thing I can attest to is that, when Armory used
LevelDB (0.8 - 0.92, IIRC), quite a few users had DB corruption issues,
particularly on Windows. Even when a switch to LMDB occurred for 0.93,
loads of complaints would come in from users whose LevelDB-based Core
DBs would fail. I know that the guy who moved Armory over to LMDB would
love to have more time in the day so that he could write a Core patch
that does the same. It's a very sore spot for him.
(FWIW, LMDB seems to work quite nicely, at least once you patch up the
source a little bit. The latest version is also compatible with Core's
cross-compiling scheme. I'd love to see it added to Core one day.)
Doug
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

telemaco via bitcoin-dev

2015-10-29 06:57:39 UTC

Permalink

Why not allow two options:

1/ a default RocksDB/SQLite/LevelDB (whatever is decided)
2/ alternative provide instructions for connection to any other rdbms
using odbc or jdbc.

Why not allowing async disk writes or incredibly fast database systems
if someone wants to have a node in a very fast datacenter or connected
with their existing leveraged dataservers. It is the traditional
approach to just use the open standard for database connectivity.

Any person or any organization would just need to have one machine with
their bitcoin node with a rdbms client installed (SAP Sybase client, or
oracle client, or microsoft). The bitcoin node would just store their
data using the odbc/jdbc protocol on ANY rdbms installed anywhere in
their organization (other machine or the same). They would just need to
issue a "create table" with a very simple table structure and they would
benefit from async and indexes and using their already licensed, and
configured system of their choosing, with bitcoin information being
available to thousands of software packages and available aswell to
thousands of programmers that work with rdbms and not just "RocksDB" or
some obscure database system.

Why not "outsource" totally that data management part to the already
existing with decades of experience database world. People would be able
to create incredibly easy bitcoin statistics/graphs/analisys with
existing software packages (hey even excel or libreoffice like) or
connect bitcoin data to their own sources and if so they chose analyze
bitcoin data on a datawarehouse or any imaginable approach. Of course
every transaction would be have to do through the bitcoin node and only
the data management would be on rdbms side.

Luke Dashjr via bitcoin-dev

2015-10-29 08:03:50 UTC

Permalink

Post by telemaco via bitcoin-dev
1/ a default RocksDB/SQLite/LevelDB (whatever is decided)
2/ alternative provide instructions for connection to any other rdbms
using odbc or jdbc.

I predict this would be a disaster. UTXO storage is CONSENSUS-CRITICAL code.
Any divergence in implementation behaviour, including bugs AND bugfixes, may
cause consensus failure. For this to have a reasonable *hope* of working, we
need to choose one storage engine, and *will* need to maintain consensus-
compatibility of it ourselves (since nobody else cares).

Fixing LevelDB frankly seems like an easier task than switching to anything
SQL-based, which would require a *lot* more *difficult-to-get-consensus-
compatible* code that we are all (or at least mostly) very unfamiliar with.

Research is fine, but let's be realistic about deployment.

Luke

Simon Liu via bitcoin-dev

2015-10-30 03:04:19 UTC

Permalink

Storage of UTXO data looks like an implementation detail and thus one
would have thought that the choice of database would not increase the
odds of consensus protocol failure.

Btcd, a full node implementation written in Go, already provides a
database interface which supports different backends:

https://github.com/btcsuite/btcd/tree/master/database

Given that UTXO storage is considered critical, it seems reasonable to
let a node operator decide for themselves if they want data stored in
LevelDB (which is not fully ACID compliant) or a database like Sqlite,
Oracle, DB2 etc.

If the storage requirements for UTXO data are fairly simple, consisting
mainly of puts and gets, there is a decent argument that using a
dedicated key-value store provides superior performance over a
traditional SQL database.

However, from a practical perspective, given that nodes operate on a
range of different hardware and even a little Raspberry Pi can run a
full node and keep up with the network, why not let those users with the
resources to operate big iron databases do so? It would be a good
feature to have.

Post by Luke Dashjr via bitcoin-dev
I predict this would be a disaster. UTXO storage is CONSENSUS-CRITICAL code.
Any divergence in implementation behaviour, including bugs AND bugfixes, may
cause consensus failure. For this to have a reasonable *hope* of working, we
need to choose one storage engine, and *will* need to maintain consensus-
compatibility of it ourselves (since nobody else cares).
Fixing LevelDB frankly seems like an easier task than switching to anything
SQL-based, which would require a *lot* more *difficult-to-get-consensus-
compatible* code that we are all (or at least mostly) very unfamiliar with.
Research is fine, but let's be realistic about deployment.
Luke
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

Gregory Maxwell via bitcoin-dev

2015-10-30 03:35:33 UTC

Permalink

On Fri, Oct 30, 2015 at 3:04 AM, Simon Liu via bitcoin-dev

Post by Simon Liu via bitcoin-dev
Given that UTXO storage is considered critical, it seems reasonable to

This sounds like a misunderstanding of what consensus criticial means.
It does not mean that it must be right (though obviously that is
preferable) but that it must be _consistent_, between all nodes.

Post by Simon Liu via bitcoin-dev
full node and keep up with the network, why not let those users with the
resources to operate big iron databases do so? It would be a good
feature to have.

Because it provides no value, the data is opaque and propritarily
encoded with a compression function which we may change from version
to version, and because many of these alternatives are enormously
slow; enough that they present problems with falling behind the
network even on high performance hardware.

Moreover, additional functional which will not be sufficiently used
will not adequately maintained and result in increased maintains costs
and more bugs.

Gregory Maxwell via bitcoin-dev

2015-10-30 04:28:47 UTC

Permalink

Can you give a specific example of how nodes that used different database technologies might determine different answers to whether a given transaction is valid or invalid? I’m not a database expert, but to me it would seem that if all the unspent outputs can be found in the database, and if the relevant information about each output can be retrieved without corruption, then that’s all that really matters as far as the database is concerned.

If you add to those set of assumptions the handling of write ordering
is the same (e.g. multiple updates in an change end up with the same
entry surviving) and read/write interleave returning the same results
then it wouldn't.

But databases sometimes have errors which cause them to fail to return
records, or to return stale data. And if those exist consistency must
be maintained; and "fixing" the bug can cause a divergence in
consensus state that could open users up to theft.

Case in point, prior to leveldb's use in Bitcoin Core it had a bug
that, under rare conditions, could cause it to consistently return not
found on records that were really there (I'm running from memory so I
don't recall the specific cause). Leveldb fixed this serious bug in a
minor update. But deploying a fix like this in an uncontrolled manner
in the bitcoin network would potentially cause a fork in the consensus
state; so any such fix would need to be rolled out in an orderly
manner.

I’d like a concrete example to help me understand why more than one implementation of something like the UTXO database would be unreasonable.

It's not unreasonable, but great care is required around the specifics.

Bitcoin consensus implements a mathematical function that defines the
operation of the system and above all else all systems must agree (or
else the state can diverge and permit double-spends); if you could
prove that a component behaves identically under all inputs to another
function then it can be replaced without concern but this is something
that cannot be done generally for all software, and proving
equivalence even in special cases it is an open area of research. The
case where the software itself is identical or nearly so is much
easier to gain confidence in the equivalence of a change through
testing and review.

With that cost in mind one must then consider the other side of the
equation-- utxo database is an opaque compressed representation,
several of the posts here have been about desirability of blockchain
analysis interfaces, and I agree they're sometimes desirable but
access to the consensus utxo database is not helpful for that.
Similarly, other things suggested are so phenomenally slow that it's
unlikely that a node would catch up and stay synced even on powerful
hardware. Regardless, in Bitcoin core the storage engine for this is
fully internally abstracted and so it is relatively straight forward
for someone to drop something else in to experiment with; whatever the
motivation.

I think people are falling into a trap of thinking "It's a <database>,
I know a <black box> for that!"; but the application and needs are
very specialized here; no less than, say-- the table of pre-computed
EC points used for signing in the ECDSA application. It just so
happens that on the back of the very bitcoin specific cryptographic
consensus algorithim there was a slot where a pre-existing high
performance key-value store fit; and so we're using one and saving
ourselves some effort. If, in the future, Bitcoin Core adopts a
merkelized commitment for the UTXO it would probably need to stop
using any off-the-shelf key value store entirely, in order to avoid a
20+ fold write inflation from updating hash tree paths (And Bram Cohen
has been working on just such a thing, in fact).

Gregory Maxwell via bitcoin-dev

2015-11-15 03:30:45 UTC

Permalink

I think you’re being intentionally obtuse here: accepting a block composed entirely of valid transactions that is 1.1 MB is entirely different than accepting a TX that creates a ten thousand bitcoins out of thin air. The market would love the former but abhor the later. I believe you can recognize the difference.

It is not technically distinct--today; politically-- perhaps, but--
sorry, no element of your prior message indicated that you were
interested in discussing politics rather than technology; on a mailing
list much more strongly scoped for the latter; I hope you can excuse
me for missing your intention prior to your most recent post.

That said, I believe you are privileging your own political
preferences in seeing the one rule of the bitcoin system as
categorically distinct even politically. No law of nature leaves the
other criteria I specified less politically negotiable, and we can see
concrete examples all around us -- the notion that funds can be
confiscated via external authority (spending without the owners
signature) is a more or less universal property of other modern
systems of money, that economic controls out to exist to regulate the
supply of money for the good of an economy is another widely deployed
political perspective. You, yourself, recently published a work on the
stable self regulation of block sizes based on mining incentives that
took as its starting premise a bitcoin that was forever inflationary.
Certainly things differ in degrees, but this is not the mailing list
to debate the details of political inertia.

Thank you for conceding on that point.

You're welcome, but I would have preferred that you instead of your
thanks you would have responded in kind and acknowledged my correction
that other consensus inconsistencies discovered in implementations
thus far (none, that I'm aware of) could be classified as "maybe"; and
in doing so retained a semblance of a connection to a the technical
purposes of this mailing list.

Tamas Blummer via bitcoin-dev

2015-11-17 13:54:19 UTC

Permalink

Isolating storage from the rest of consensus code is technically desirable, but implementations using different storage will be unlikely bug-for-bug compatible,
hence able to split the network.

Such split was disastrous on the network level if partitions were of comparable magnitude - as was the case in the March 2013 fork between versions of Bitcoin Core.

This means high level implementation diversity was great, provided we would get there without blowing up the network on the way from monoculture to true decentralization of code.

Libconsensus is immensely valuable to get diversity, as it makes alternate implementations bug-for-bug compatible for a big part of the consensus code.

Tamas Blummer

Tom Harding via bitcoin-dev

2015-11-17 15:24:42 UTC

Permalink

On Nov 17, 2015 5:54 AM, "Tamas Blummer via bitcoin-dev" <

Post by Tamas Blummer via bitcoin-dev
Isolating storage from the rest of consensus code is technically

desirable, but implementations using different storage will be unlikely
bug-for-bug compatible,

Post by Tamas Blummer via bitcoin-dev
hence able to split the network.

The problem with unknown bugs is you don't know how serious they are. A
serious bug could itself be devastating.

telemaco via bitcoin-dev

2015-11-17 22:17:33 UTC

Permalink

Shouldn't a odbc jdbc jconnect or equivalent be totally transparent for the consensus code? I mean, the client would write or store the data communicating to the driver provided by the vendor. Using the schema bitcoin suggests adapted to many different vendors (one table schema for Oracle, other for mysql, etc with their slight syntax particularities), installed in the machine with the node and from that communication to the driver the storage would be totally controlled by the third party rdbms.
Regarding bugs or risk of fork, does not have actual client any defense against someone forking core and slightly changing the actual database used maybe wrongly and creating a fork by themselves?
Does the client have any way to verify that what is stored is correct? Maybe inserting a column with a hash of what is stored in each row and another column with a incremental row by row hash composed by the hash of each row and the previous column one., so any tampering in a previous row can be verified up to where is not consistent.
I just imagine what would be for people to be able to access easily (with the thousands of software packages already bought and licensed by ALL companies in the world that already use open standard connectivity or equivalents)., the bitcoin blockchain.
SUBSCRIPTION: for a couple decades replication servers have allowed a publish/subscription model using replication agents. If I am a guy working on a lever in the warehouse with my pda I do not need on my pda all the company info or maybe all the blockchain. If a company., that has already licensed a rdbms package with dozens of related software packages needs one guy to suscribe to something on the bitcoin blockchain, he can either use one of the purchased methods in their company and access the company database that holds blockchain data or hire a rare bitcoin developer that will create a interfaz bitcoin for a specific need up to the millions of needs out there.
PUBLISHING Maybe even to have a publishing daemon that would allow those companies and their software packages to write things in the bitcoin blockchain provided of couse that they fund the agent with a small bitcoin amount to send transactions and they comply with the database constraint of being the owners of the private key. The publishing agent would check for changes every X minutes on that specific address in the db and if funded it would publish "send" the transaction through the bitcoin client. People would be able to publish info on the decentralized ledger from 90% of enterprise software packages.,paying ofc and with the small delay of the publishing agent checking for changes. In fact the db would allow publishing info while the publishing agent could just take its time publishing at its own rate like a slow write cache.
In any case shouldn't even actual consensus be shielded from a malfunctioning or Ill forked database from core client

Post by Tom Harding via bitcoin-dev
On Nov 17, 2015 5:54 AM, "Tamas Blummer via bitcoin-dev" <

Post by Tamas Blummer via bitcoin-dev
Isolating storage from the rest of consensus code is technically

desirable, but implementations using different storage will be unlikely
bug-for-bug compatible,

Post by Tamas Blummer via bitcoin-dev
hence able to split the network.

The problem with unknown bugs is you don't know how serious they are.
A
serious bug could itself be devastating.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Jorge Timón via bitcoin-dev

2015-11-20 14:15:20 UTC

Permalink

On Tue, Nov 17, 2015 at 11:17 PM, telemaco via bitcoin-dev

Post by telemaco via bitcoin-dev
Shouldn't a odbc jdbc jconnect or equivalent be totally transparent for the
consensus code?

Yes, but we're only testing levelDB and we couldn't assure that it
won't produce unintentional consensus forks with other databases
behind the whatever db-agnostic interface.
I believe Bitcoin Core should officially support only one database at
a time. And if that is to change in the future, I don't think it
should be before a storage-agnostic libconsensus is encapsulated (and
after that there will still be risks and costs in officially
supporting several several databases simultaneously).
As has been said, these kind of experiments are welcomed outside of
bitcoin/master though.

Rusty Russell via bitcoin-dev

2015-11-16 01:52:28 UTC

Permalink

You are looking at the problem from a “top down” governance
perspective assuming you know what code is actually being run and what
rules the market wants.

We have strayed far from both the Subject line and from making progress
on bitcoin development. Please redirect to bitcoin-discuss.

I have set the moderation bits on the three contributors from here down
(CC'd): your next post will go to moderation.

Thanks,
Rusty.

Gregory Maxwell via bitcoin-dev

2015-10-29 08:17:27 UTC

Permalink

On Thu, Oct 29, 2015 at 6:57 AM, telemaco via bitcoin-dev

Post by telemaco via bitcoin-dev
Why not "outsource" totally that data management part to the already
existing with decades of experience database world. People would be able to
create incredibly easy bitcoin statistics/graphs/analisys with existing
software packages (hey even excel or libreoffice like) or connect bitcoin
data to their own sources and if so they chose analyze bitcoin data on a
datawarehouse or any imaginable approach. Of course every transaction would
be have to do through the bitcoin node and only the data management would be
on rdbms side.

The word "database" is likely confusing people here. This is not a
database in an ordinary sense.

The bitcoin core consensus engine requires a highly optimized ultra
compact data structure to perform the lookups for coin existence. The
data stored is highly compressed and very specialized, it would not be
useful to other applications. Right now, on boring laptop hardware,
during network synchronization updates to this database run at over
10,000 records per second, while the system is also busy doing the
other validation chores of a node. This is backended by a high
performance transactional key value store. The need for performance
here is essential to even keeping up with the network, it's not about
enabling any kind of fancy querying (bitcoin core does not offer fancy
querying), it's about the base load that every node must handle to
usably sync up and keep up with the Bitcoin network.

The backend can be swapped out for something else that provides the
same properties, but doing so does not give you any of the
inspection/analytics that you're looking for. Systems that do that
exist, and they require databases taking hundreds of gigabytes of
storage and take days to weeks to import the network data. They're
great for what they're for, but they're not suitable for consensus use
in the system for space efficiency, performance, and consensus
consistency reasons.