Hosting Editoria as a Service

Dear all, over the past few weeks we have been working on a mechanism to provide Editoria (and all of its flavors) as a service on demand, similar to how we provide other platforms. As you already know, parts of the Editoria documentation still need updating, which delayed our implementation. The extra time spent on R&D also made the process take longer than expected, especially while we tested every approach we could think of to make it more scalable.

The proposed setup is as follows:

  • Nginx
  • Editoria
  • Editoria dependencies
  • A non-dockerized version of PostgreSQL

We are adamant about using a non-dockerized version of Postgres because it lets us provide more accurate backups and security patches. While Docker makes sense in a development environment, we don’t recommend it in production, as it also adds unnecessary overhead, which increases load time.
We’ve noted a couple of points that might be an issue when deploying Editoria at a larger scale, which is the ultimate goal:

  • Editoria’s dependencies all rely on the dockerized version of Postgres, whereas we need to use the non-dockerized version. This prevents users from uploading .docx files, exporting PDFs and so on;
  • It is unclear to us whether there is a way to serve Editoria with NODE_ENV=production, which would bring load times down by a large margin.

How we propose to proceed:
We are working on patching it for now. However, the best solution would be an approach where the services Editoria needs are not hard-bound to Docker’s PostgreSQL container. With that said, @kominoshja and I will talk to @alexgeorg about this and update you. The idea is to review the proposed setup together and see whether anything in our approach can be improved.
I’ll update you on the progress.
Thank you,

R.S

Hi Redon!

Postgres as a docker container provided by Editoria should only be used for development and testing. When running Editoria in production, the app should just accept a connection to a pg db that the deployer provides. It is really up to them (you :slight_smile: ) to decide whether that will be dockerized or not. Same goes for nginx: It is pretty much outside the application’s scope. So it should neither provide, nor block use of nginx or any other similar technology.

The above should currently be possible through node-config’s custom environment variables file.
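To illustrate the idea, here is a minimal TypeScript sketch of how a mapping from config keys to environment variable names could be resolved at startup. It is a simplified illustration of the concept, not node-config’s actual implementation, and the config keys and variable names (`POSTGRES_HOST` and so on) are assumptions made for the example rather than Editoria’s real ones.

```typescript
// Hypothetical mapping from config keys to environment variable names.
// Keys and variable names are illustrative assumptions, not Editoria's real configuration.
type EnvMapping = { [key: string]: string | EnvMapping };
type Config = { [key: string]: string | Config };
type Env = { [name: string]: string | undefined };

const dbMapping: EnvMapping = {
  db: {
    host: "POSTGRES_HOST",
    port: "POSTGRES_PORT",
    database: "POSTGRES_DB",
    user: "POSTGRES_USER",
    password: "POSTGRES_PASSWORD",
  },
};

// Walk the mapping and pull each value from the environment.
// Unset or empty variables are skipped so that file-based defaults can still apply.
function resolveFromEnv(mapping: EnvMapping, env: Env): Config {
  const result: Config = {};
  for (const key of Object.keys(mapping)) {
    const target = mapping[key];
    if (typeof target === "string") {
      const value = env[target];
      if (value !== undefined && value !== "") {
        result[key] = value;
      }
    } else {
      const nested = resolveFromEnv(target, env);
      if (Object.keys(nested).length > 0) {
        result[key] = nested;
      }
    }
  }
  return result;
}
```

In a Node process this would be called as `resolveFromEnv(dbMapping, process.env)`, with the result merged over the defaults from the config files. The deployer can then point the app at any Postgres instance, dockerized or not, without touching the code.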

Does your issue with docker extend to other services beyond postgres?

As for serving the application with NODE_ENV=production, that should also already be possible. Did you try using the production.js file in the config folder, and did it not work?

Ideally, the whole thing would come prepackaged in a production-ready docker image (without pg) with its dependencies, that you would then need to connect to a db, and orchestrate as you wish. But that also requires development time.

Contributions and suggestions to improve docs are always more than welcome.

Just my two cents, though it’s up to @alexgeorg to make final decisions on Editoria.
(as well as definitely answer all those “shoulds” above)

:v:

Sorry I disagree here with both of you :stuck_out_tongue:

As @yannis mentioned, you do not have to run Postgres in Docker; there is no obligation there. It is up to you. Having everything dockerized just makes things easy for dev and for a quick start.

What do you mean by preventing users from uploading .docx, exporting PDF and so on? How is a dockerized Postgres responsible for that?

You can :wink:

I do not understand what makes it hard-bound. Can you elaborate on that? You just need to pass the right conf with your Postgres host, credentials and so on to Editoria and that’s it, Docker or no Docker.

Everything is configurable through env variables.

I would not say ideally there :). Well it depends on what you mean by prepackaged with its dependencies. In docker 1 process = 1 container :slight_smile:

My ideal world (we are not too far from it):

  1. everything is dockerized, we disagree there but our visions are not incompatible
  2. Split each process into its own container (our main issue today as we already discussed with @alexgeorg )
  3. Configure everything through env variables, frontend and backend. Everything should be configured at runtime, not at build time
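
Point 3 can be sketched for the frontend as follows: instead of baking values into the bundle at build time, the server renders a small script from its environment, and the client reads it when it loads. This is a minimal TypeScript sketch under stated assumptions; the variable names (`CLIENT_API_URL`, `CLIENT_LOCALE`), the `/config.js` path and the `window.__RUNTIME_CONFIG__` global are all hypothetical and not part of Editoria.

```typescript
// Only variables on this allowlist are exposed to the browser (names are hypothetical).
const CLIENT_KEYS = ["CLIENT_API_URL", "CLIENT_LOCALE"];

type RuntimeEnv = { [name: string]: string | undefined };

// Collect the allowlisted variables that are actually set.
function pickClientConfig(env: RuntimeEnv): { [name: string]: string } {
  const picked: { [name: string]: string } = {};
  for (const key of CLIENT_KEYS) {
    const value = env[key];
    if (value !== undefined) {
      picked[key] = value;
    }
  }
  return picked;
}

// Render a script body that the server could serve (for example as /config.js),
// so that the same built bundle works in any environment.
function renderRuntimeConfig(env: RuntimeEnv): string {
  // Escape "<" so a value cannot close the surrounding <script> tag.
  const json = JSON.stringify(pickClientConfig(env)).replace(/</g, "\\u003c");
  return "window.__RUNTIME_CONFIG__ = " + json + ";";
}
```

The allowlist matters here: backend secrets such as database credentials live in the same environment and must never reach the browser.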

As we already mentioned with Pierre, the main issue today is that Editoria needs 6 processes that are tightly tied together, as they need to share the filesystem. That makes it not scalable and quite heavy when deploying multiple instances, as each Editoria instance needs its own export job, for example.

As @yannis puts it, the PostgreSQL in the docker container is a dev solution, not meant for production. There are existing production deployments of PubSweet apps that use various PostgreSQL databases (e.g. AWS RDS, or custom in-house deployments, etc.), so that should work.

The setup you describe is standard and definitely makes sense to me!

It would help if you described what kind of issues you faced using this setup. I’m specifically interested in potential issues with the job-xsweet component (this one does run in a Docker container, for ease of managing and containing its dependencies). I’m sure there’s a lot of stuff that needs to be ironed out and documented for production environments, but I’m also confident that it’s mostly small adjustments. Let’s figure this stuff out! :slight_smile:

@unteem, I don’t think we disagree. I just didn’t explain what I meant clearly. :slight_smile:

So, all I meant by pre-packaged is that you would get one or more containers, instead of grabbing the code, installing dependencies etc. I totally agree that all the separate processes should be isolated and independent containers in their own right (e.g. the app proper, the various exporting / transforming services etc). Then you’d be able to orchestrate each as needed.

I know you and @alexgeorg are taking steps in that direction, so the system’s heaviness on the operations side should improve in upcoming versions.

Really happy the discussion has started in this communication channel as well. I was wondering if we could have a call next week, since this involves people who have been involved in the project in different capacities. Wednesday or Friday would be ideal for me.

R.S

@alexgeorg - can you give an update on the work you are doing that @unteem and @yannis referenced?

Hello there! I am super excited that @redonski started this conversation, and about the participation from all of you!

So, my opinion regarding the deployments of Editoria aligns perfectly with @unteem’s points. This means:

  • In production there is no need for a Postgres container. That was never the case, btw. The db container is only there to facilitate development and nothing else.
  • As @yannis and @unteem said, everything (connection to db, configuration of the app, communication with db and services) is configurable and doable via env variables.
  • The main pain point of Editoria is the sharing of the file system across its services, and this will change in the near future. To this end, a reusable microservice will be implemented for PDF export. A message queue will probably also be needed here (@jure), as this microservice will do extensive work and should therefore be decoupled, operate asynchronously and be shared between different instances.
  • Regarding docker containers in general (for all the services that use pubsweet’s job queue and any other future ones) that is the preferred way to go in order to achieve scalability and availability.
  • I agree with using nginx, which @redonski mentioned, for load balancing; we already make use of it in the majority of our deployments.
  • Finally, to resolve the issue of sharing the file system across Editoria’s services, some of the services need a bit of refactoring, and S3 needs to be introduced as the file hosting solution.
  • Let’s keep the conversation going and I will also keep you all updated on any developments regarding all of the above. Thank you all for the good points :slight_smile:

PS: @adam I am still studying/thinking about the microservices approach that I’ve mentioned above, and also doing some refactoring which will eventually put us in a better position when it comes to the application’s structure and leaner Docker containers (separating server from client).
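
The decoupling described above can be sketched with a small in-memory queue: instances submit export jobs and get an id back immediately, while a single shared worker processes the jobs later. This is only an illustration of the pattern in TypeScript. A real deployment would use a persistent message queue, and all names here (`ExportQueue`, `instanceId`, `bookId`) are hypothetical.

```typescript
type JobStatus = "queued" | "done" | "failed";

interface ExportJob {
  id: number;
  instanceId: string; // which Editoria instance submitted the job
  bookId: string;
  status: JobStatus;
  result?: string;
}

class ExportQueue {
  private nextId = 1;
  private pending: ExportJob[] = [];
  private jobs: { [id: number]: ExportJob } = {};

  // Called by an instance: returns immediately with a job id.
  submit(instanceId: string, bookId: string): number {
    const job: ExportJob = { id: this.nextId++, instanceId, bookId, status: "queued" };
    this.pending.push(job);
    this.jobs[job.id] = job;
    return job.id;
  }

  // Called by the shared worker: processes the oldest queued job, if any.
  // Returns false when there was nothing to do.
  processNext(render: (job: ExportJob) => string): boolean {
    const job = this.pending.shift();
    if (job === undefined) return false;
    try {
      job.result = render(job);
      job.status = "done";
    } catch (err) {
      // A failing export marks only this job as failed; the queue keeps going.
      job.status = "failed";
    }
    return true;
  }

  // Called by an instance to poll for the outcome.
  status(id: number): JobStatus | undefined {
    const job = this.jobs[id];
    return job ? job.status : undefined;
  }
}
```

Because the submitting instance never runs the export itself, one worker can serve several Editoria instances, which is the property the shared file system currently prevents.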

Please let me know your availability and Boris’s for next week, as this week is a bit too complicated for me.

This call should be open for everyone… @alexgeorg / @redonski when you find a time for the call can you put the link here plus a time and date? (don’t forget to list timezone :slight_smile: )

Sure, we will do that.

Hello,
Regarding the call, @redonski and I are available on the following dates next week:

  • 5th of February, 17:00 - 18:00 (CET)
  • 7th of February, 15:00 - 16:00 (CET)
  • 7th of February, 16:00 - 17:00 (CET)
  • 7th of February, 17:00 - 18:00 (CET)

Looking forward to discussing this!

Option 2 or 3 works for me

Thanks Alex. What about the others? I would propose to use Jitsi, but if there are other preferences please do propose :slight_smile:.

R.S

Ok guys, let’s do it 7th of February, 16:00 - 17:00 (CET). I will share a link to the Jitsi channel soon.

R.S

Hey folks!

Let’s meet here to talk: https://meet.jit.si/Editoria-Is-Awesome

See you at 16:00!

Hi again folks!

I want to start by thanking you for joining the call. A lot of technical questions we had are clear now and will def help us move faster.

The main issue we had was that Editoria’s dependencies were waiting for the dockerized version of PostgreSQL to be running, because the environment was set to development. This is not the case when Editoria runs in production mode. @alexgeorg will post brief instructions here on how to run it in production mode, which we will test and then add to the deployment documentation under editoria-vanilla.

Once we have a scalable solution I will spend some time to write documentation on deployment that will make it easier for anyone to have their own installation.

@unteem @pierreozoux Let’s have a call next Friday same time!

Hey @kominoshja, take a look at the docker-compose.production.yml file here so that we can discuss further: https://gitlab.coko.foundation/editoria/editoria-vanilla/tree/docker-prod

Hey @alexgeorg! Great job!

We’re now able to have the services running with the native PostgreSQL. However, there are still some things that need to be done for production mode. Mainly, assets get deleted and not all dependencies are installed.

Can we have a call on Friday at 16:30 CET?

Hi folks!

Some news I’m excited to share: we went ahead and installed Editoria at https://editoria-vanilla.cloud68.co using the docker-prod branch created by @alexgeorg.

There are a few things to be ironed out regarding production mode, such as precompiling the assets and installing yarn dependencies when installing with NODE_ENV=production.

After this is solved, we think Editoria is ready to be distributed under the initial production mode for hosters.

Reminder: @alexgeorg are you still up for today’s call at 16:30 CET?
