Can I quickly import rdb data? #412

Open
chunjieyaya opened this issue May 23, 2024 · 12 comments
Labels
help wanted (Extra attention is needed), question (Further information is requested)

Comments

@chunjieyaya

chunjieyaya commented May 23, 2024

Feature request type

sample request

Is your feature request related to a problem? Please describe

After saving an existing Redis database as an RDB file, I want to import it into Garnet.

Describe the solution you'd like

It should be possible to import the RDB data by running a command.

Describe alternatives you've considered

No response

Additional context

No response

@badrishc
Contributor

Workaround would be to load the rdb into redis, scan the DB, and issue set operations on Garnet.
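
A minimal sketch of that workaround, assuming the RDB file has already been loaded into a running Redis instance and that both servers are reached through redis-py style clients (objects exposing `scan_iter`, `type`, `get`, `pttl` and `set`). The function name `copy_string_keys` is hypothetical, and only plain string keys are copied; lists, hashes, sets and other types are reported back so they can be handled separately.

```python
def copy_string_keys(src, dst, batch_hint=1000):
    """Copy plain string keys (with their remaining TTL) from src to dst.

    src and dst are assumed to be redis-py style clients. Keys of any
    other type are skipped and returned so the caller can deal with them.
    """
    copied = 0
    skipped = []
    for key in src.scan_iter(count=batch_hint):
        key_type = src.type(key)
        if isinstance(key_type, bytes):
            key_type = key_type.decode()
        if key_type != "string":
            skipped.append(key)
            continue

        value = src.get(key)
        if value is None:
            # The key expired or was deleted between SCAN and GET.
            continue

        ttl_ms = src.pttl(key)
        if ttl_ms is not None and ttl_ms > 0:
            # Preserve the remaining time to live in milliseconds.
            dst.set(key, value, px=ttl_ms)
        else:
            dst.set(key, value)
        copied += 1
    return copied, skipped
```

With redis-py installed, this could be called as `copy_string_keys(redis.Redis(port=6379), redis.Redis(port=6380))`, where the two ports are placeholders for the Redis and Garnet endpoints.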

@chunjieyaya
Author

> Workaround would be to load the rdb into redis, scan the DB, and issue set operations on Garnet.

Could this feature be included in a follow-up plan?

@darrenge added the question (Further information is requested) label on May 24, 2024
@vazois
Contributor

vazois commented Jun 3, 2024

> > Workaround would be to load the rdb into redis, scan the DB, and issue set operations on Garnet.
>
> Could this feature be included in a follow-up plan?

This is a good idea, to have an rdb importer for Garnet. Unfortunately, it would require constant maintenance to keep up with updates to the rdb format.
I did some quick research and found many existing tools for parsing, formatting, and analyzing rdb files.
However, none of them seem to support newer Redis versions.
I am listing some of them here for reference.

If we had a robust C# rdb parser, we could use the internal Garnet API to convert the data into upsert commands,
similar to what rdb-rs does with its conversion to the RESP protocol.
Bottom line, this is an issue that should be marked as help wanted.
If someone wants to take a stab at it, I am happy to provide guidance and feedback.
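
To illustrate the "convert to RESP" idea, here is a small sketch of how a parsed key/value pair could be turned into a RESP command. It only shows the wire encoding of a command as a RESP array of bulk strings; the helper name `encode_resp_command` is hypothetical and is not taken from rdb-rs or Garnet.

```python
def encode_resp_command(*args):
    """Encode a command such as ("SET", key, value) as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        if isinstance(arg, str):
            arg = arg.encode("utf-8")
        # Each argument is a bulk string: $<byte length>\r\n<bytes>\r\n
        parts.append(b"$%d\r\n" % len(arg))
        parts.append(arg + b"\r\n")
    return b"".join(parts)


if __name__ == "__main__":
    print(encode_resp_command("SET", "foo", b"bar"))
```

An importer built this way would emit one such command per decoded entry and send the bytes to the server over a normal client connection.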

https://github.com/SamuelFisher/CLRdb
https://github.com/badboy/rdb-rs
https://github.com/redis/librdb
https://github.com/HDT3213/rdb

https://rdb.fnordig.de/file_format.html
https://github.com/redis/redis/blob/unstable/src/rdb.h

@vazois added the help wanted (Extra attention is needed) label on Jun 3, 2024
@chunjieyaya
Author

Very good

@PaulusParssinen
Contributor

PaulusParssinen commented Jun 4, 2024

I can write a small, robust RDB (de)serialization .NET library as a third-party library in the upcoming weeks and publish it on NuGet. Would this work for Garnet?

@vazois
Contributor

vazois commented Jun 4, 2024

> I can write a small, robust RDB (de)serialization .NET library as a third-party library in the upcoming weeks and publish it on NuGet. Would this work for Garnet?

I think it would be a great idea to start with a third-party library that works similarly to the other projects (i.e. an rdb-to-RESP or rdb-to-JSON converter) but is, of course, more up to date.
If it is native C#, it would be cool to integrate it into Garnet, use the internal GarnetAPI, and load the rdb file during the recovery phase.
Though I would suggest doing the parser first, since this is the major ticket item.
The problem I found while researching this (admittedly, I did not spend too much time on it) is that there are no official docs for the rdb format.
The only way I see is to reverse engineer it from the redis codebase.
I posted some relevant links about this above.
As you find more information, it would be great to post it here.
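
As a starting point for the parser, two pieces that an RDB reader typically needs first are the file header and the length encoding. The sketch below follows the layout described in the rdb.h and file-format links posted earlier: a `REDIS` magic string followed by a four-digit ASCII version, and lengths whose top two bits select the encoding. The function names are illustrative and do not come from any existing library, and everything past these two steps (opcodes, value types, checksum) is left out.

```python
import struct


def parse_header(data):
    """Return (rdb_version, offset_after_header) for a buffer starting an RDB file."""
    if len(data) < 9 or data[:5] != b"REDIS":
        raise ValueError("not an RDB file: missing REDIS magic")
    version_field = data[5:9]
    if not version_field.isdigit():
        raise ValueError("invalid RDB version field")
    return int(version_field.decode("ascii")), 9


def read_length(data, pos):
    """Decode one length field; return (value, is_special_encoding, new_pos)."""
    first = data[pos]
    kind = first >> 6
    if kind == 0:  # 00xxxxxx: length in the low 6 bits
        return first & 0x3F, False, pos + 1
    if kind == 1:  # 01xxxxxx: 14-bit length, low 6 bits plus the next byte
        return ((first & 0x3F) << 8) | data[pos + 1], False, pos + 2
    if kind == 3:  # 11xxxxxx: low 6 bits name a special string encoding
        return first & 0x3F, True, pos + 1
    if first == 0x80:  # 32-bit big-endian length follows
        return struct.unpack_from(">I", data, pos + 1)[0], False, pos + 5
    if first == 0x81:  # 64-bit big-endian length follows
        return struct.unpack_from(">Q", data, pos + 1)[0], False, pos + 9
    raise ValueError("unknown length encoding byte 0x%02x" % first)
```

Returning the new offset alongside the value keeps the reader stateless, which makes each step easy to test against byte strings taken from real dump files.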

@PaulusParssinen
Contributor

PaulusParssinen commented Jun 4, 2024

Reverse engineering file formats is right up my alley. This is already much easier because I have some source to look at, which normally isn't the case for me 😄 (I can reverse protocols/file formats out of a black box if need be)

I will of course only look at Redis source from before the license change (I have a fork frozen at the last commit before the license change, e64d91c). And yeah, parser first of course.

@PaulusParssinen
Contributor

PaulusParssinen commented Jun 20, 2024

By the looks of it, I may have a time slot to tackle this starting next week 👍

UPDATE: gotta postpone a bit. Up for grabs in the meantime.

@badrishc
Contributor

> By the looks of it, I may have a time slot to tackle this starting next week 👍
>
> UPDATE: gotta postpone a bit. Up for grabs in the meantime.

Hey @PaulusParssinen - any interest in resuming this work? In general, we would LOVE to see you contribute again to Garnet :).

@s3w3nofficial

Hi, I decided to look into this, since I have already done some rdb parsing in #899.

Here is a link to my repo. It contains some basic parsing logic and a CLI.

Any suggestions on how to approach this and which methods I should expose are welcome.

@badrishc
Contributor

badrishc commented Jan 16, 2025

Ideally, Garnet would have a native --restore-rdb switch that would read the RDB file and use GarnetApi to populate the database on startup, without having to use a library to translate. JSON etc. as an intermediate format is expensive and unnecessary.

@PaulusParssinen
Contributor

> > By the looks of it, I may have a time slot to tackle this starting next week 👍
> > UPDATE: gotta postpone a bit. Up for grabs in the meantime.
>
> Hey @PaulusParssinen - any interest in resuming this work? In general, we would LOVE to see you contribute again to Garnet :).

Hey! There is interest but, unfortunately, a severe lack of time. It turns out you can't just ignore your studies indefinitely to focus on open-source contributions and other procrastination 😅

I have thought about picking this up but I'm glad to see @s3w3nofficial has stepped up in my stead, thank you! 🫡 If @s3w3nofficial gets the basic version in, we can always improve and build on top of it and I'll happily help where and when I can.

Quick comment on the current PR: writing efficient RDB (de)serialization logic for these commands is no trivial task, and you really have to know your Span<T>s to minimize the allocations and copying happening here. But we can always optimize this later, functionality first, right? 👍
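
To make the allocation point concrete, here is a rough analogue in Python, where `memoryview` plays a role loosely similar to `Span<T>`: slices refer to the original buffer instead of copying it. This is only a sketch under simplifying assumptions. It handles strings with a 6-bit length prefix only, the function name `iter_short_strings` is hypothetical, and it says nothing about how the actual PR is written.

```python
def iter_short_strings(data):
    """Yield zero-copy views over consecutive length-prefixed strings in data.

    Only the 6-bit length form (top two bits 00) is handled in this sketch.
    """
    view = memoryview(data)
    pos = 0
    while pos < len(view):
        length = view[pos]
        if length >> 6 != 0:
            raise ValueError("only 6-bit length prefixes are handled in this sketch")
        start = pos + 1
        end = start + length
        if end > len(view):
            raise ValueError("truncated string payload")
        # Slicing a memoryview creates a new view, not a copy of the bytes.
        yield view[start:end]
        pos = end
```

A caller would convert a view to `bytes` only at the point where an owned copy is really needed, for example when handing the value to a store that keeps it.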

6 participants