I think this is a pretty niche need and probably a better fit for r/DataHoarder, but anyway, here I am.
I created this application to have a way to store my WhatsApp messages away from the Google/Meta servers, or at least to depend less on Google backup.
WhatsApp has a very limited export feature, which any user can reach through the app's own interface. Once the messages and media have been exported, you can place them in a folder monitored by ChatVault, send them to an email address monitored by ChatVault, or upload them via the interface. Once ChatVault ingests an export, it stores the chat media on disk and saves the messages in a database in a structured way. The messages can then be browsed in a front end similar to a chat application.
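To give an idea of what "saving the messages in a structured way" means, here is a minimal sketch in Python. It is not ChatVault's actual code. The line shape ("31/12/2023 22:15 - Alice: Hello"), the Message class and the parse_export function are all assumptions for illustration, and real exports vary by phone and locale.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed line shape: "31/12/2023 22:15 - Alice: Hello".
# Real exports differ by device and locale, so this pattern is illustrative only.
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4} \d{2}:\d{2}) - ([^:]+): (.*)$")


@dataclass
class Message:
    """Hypothetical structured record for one chat message."""
    timestamp: str
    author: str
    content: str


def parse_export(text: str) -> List[Message]:
    """Turn the raw text of an export into a list of Message records."""
    messages: List[Message] = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            messages.append(Message(*match.groups()))
        elif messages:
            # A line without a header is treated as a continuation
            # of the previous (multi-line) message.
            messages[-1].content += "\n" + line
    return messages
```

Each record could then be written to a database row, with media files stored next to it on disk.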
It’s still under development and far from ideal; some things need to be improved, mainly the UI. It’s also true that the way WhatsApp lets us export messages is quite bad, which makes the whole process of exporting and ingesting into ChatVault quite tightly coupled. Still, it can be useful for those who want to store their messages independently, just like I wanted.
https://github.com/vitormarcal/chatvault
Edit: added an image of the application interface.
Hello, criticism is certainly welcome!
If you can open an issue on GitHub it will be easier for me to follow, as I may not see these comments.
About the message date, you are right! Until I published the project, I was the only user, so I didn’t know exports could use several different date formats. I developed for one specific format (which isn’t even uncommon).
Someone had already warned me about this, so I started developing something that parses the message date the right way. It’s almost ready, but unfortunately some dates end up being ambiguous, like 01/01/2023, and it’s not possible to infer the correct format from them. I’ll probably have to add an environment variable for that, which I really didn’t want.
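A rough sketch of the idea in Python, just to show the ambiguity problem. This is not the code in the project. The candidate formats, the infer_date_format function and the CHAT_DATE_FORMAT variable name are all made up for the example.

```python
import os
from datetime import datetime
from typing import Iterable, Optional

# Only two candidates here to keep the example short.
CANDIDATE_FORMATS = ("%d/%m/%Y", "%m/%d/%Y")


def _parses(value: str, fmt: str) -> bool:
    """Return True if value is a valid date under fmt."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def infer_date_format(samples: Iterable[str], fallback: Optional[str] = None) -> str:
    """Pick the one format that parses every sample, else use a configured fallback."""
    samples = list(samples)
    viable = [f for f in CANDIDATE_FORMATS if all(_parses(s, f) for s in samples)]
    if not viable:
        raise ValueError("no candidate format parses every sample")
    if len(viable) == 1:
        return viable[0]
    # More than one format fits (e.g. only dates like 01/01/2023 were seen),
    # so fall back to explicit configuration. The variable name is hypothetical.
    chosen = fallback or os.environ.get("CHAT_DATE_FORMAT")
    if chosen in viable:
        return chosen
    raise ValueError("date format is ambiguous; set it explicitly")
```

A single date like 25/12/2023 anywhere in the export is enough to settle the format, so the fallback is only needed when every date in the file fits both readings.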
If you look on GitHub, I opened a bug issue for this (although maybe it’s not really a bug but rather an improvement, because it works, just with one specific format), and in the GitHub Projects section it’s already marked as under development.
Regarding the duplication of data, I mentioned it in a comment here on this post, but perhaps I should have made it clearer. Anyway, as I said in the post, the project is still far from ideal, even though it works very well for my use case (every import I do contains only new messages).
I know this is a very important point, so I added deduplication that uses the last message in the database as the cutoff. This is already in the latest version of the Docker image.
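Roughly, the cutoff idea looks like the Python sketch below. It is a simplified illustration and not the project's code. The tuple layout and the drop_already_imported function are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# Hypothetical message shape: (timestamp, author, content).
Message = Tuple[datetime, str, str]


def drop_already_imported(
    incoming: List[Message], last_saved_at: Optional[datetime]
) -> List[Message]:
    """Keep only messages strictly newer than the last one already stored."""
    if last_saved_at is None:
        # Empty chat in the database: everything is new.
        return list(incoming)
    return [m for m in incoming if m[0] > last_saved_at]
```

One caveat of a sketch like this: a new message with exactly the same timestamp as the cutoff would be skipped, which can matter if timestamps only have minute precision.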
Regarding Docker, more information about the error is welcome. You weren’t the first person to mention it, but I couldn’t reproduce the problem. I tested it on Fedora and on Ubuntu Server, I built the image locally, I pulled it from the registry, I ran docker system prune --volumes -a, and it still worked as expected.