Installing Blazegraph on Linux / Debian server

While Fuseki is a good triple store, Blazegraph is amazing. AWS reportedly even adopted it as the basis for a product called Amazon Neptune. Blazegraph comes with a workbench UI which is extremely useful. Today, we'll install Blazegraph on our Linux server.

If you read my other post on how to install Apache Jena Fuseki on Debian 9, this post will look very similar since Blazegraph will be deployed under the same servlet container Tomcat.

Important note: I've faced an issue where Blazegraph under Tomcat can't handle UTF-8 data. The solution is to use Tomcat 8.5 instead of Tomcat 7.

 


We'll need to install Java, Tomcat and a Blazegraph WAR distribution to deploy in it. Next, we'll configure the application to be run as a service, secure it with authentication and expose it as a web page using a reverse proxy.

First, log in to your server and become root.

Step 1: Install Java if you don't have it yet.

apt-get install openjdk-8-jdk
java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-8u212-b01-1~deb9u1-b01)
OpenJDK 64-Bit Server VM (build 25.212-b01, mixed mode)
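
If you'd rather check the installed version from a script than read the output by eye, here is a minimal sketch. The helper name java_major_version is my own invention, and it assumes the version string has the usual "1.8.0_212" or "11.0.2" shape.

```shell
# Hypothetical helper: print the major version for a Java version string.
# "1.8.0_212" gives 8 (legacy 1.x scheme), "11.0.2" gives 11.
java_major_version() {
  version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # drop the legacy "1." prefix
  esac
  # Keep everything before the first ".", "_", "+" or "-"
  printf '%s\n' "${version%%[._+-]*}"
}

# Usage (java -version writes to stderr, hence the 2>&1):
#   installed="$(java -version 2>&1 | awk -F'"' '/version/ {print $2}')"
#   [ "$(java_major_version "$installed")" = "8" ] || echo "Java 8 not found"
```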

Step 2: Install Tomcat

The following commands will create a user for Tomcat, download and install Tomcat 7 and configure the system daemon to handle its process.
groupadd tomcat
mkdir /opt/tomcat
useradd -g tomcat -d /opt/tomcat -s /usr/sbin/nologin tomcat

mkdir ~/tmp
cd ~/tmp
wget [link to the Tomcat 7.0.90 tar.gz file]
tar -zxvf apache-tomcat-7.0.90.tar.gz
mv apache-tomcat-7.0.90/* /opt/tomcat
chown -R tomcat:tomcat /opt/tomcat/
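
An optional extra, not part of the original steps: before unpacking a downloaded archive, you can compare it against a published SHA-512 checksum. This is only a sketch; verify_sha512 is a hypothetical helper, and it assumes you have obtained the expected checksum yourself from the same place you downloaded the archive.

```shell
# Hypothetical helper: succeed only if the file's SHA-512 matches the expected hex digest.
verify_sha512() {
  file="$1"
  expected="$2"
  # sha512sum prints "<digest>  <file>"; keep only the digest
  actual="$(sha512sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Usage (the checksum value is a placeholder you must supply):
#   verify_sha512 apache-tomcat-7.0.90.tar.gz "<published sha512>" || echo "checksum mismatch"
```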
Create file /etc/systemd/system/tomcat.service with the following content.
nano /etc/systemd/system/tomcat.service
[Unit]
Description=Apache Tomcat 7
Wants=network.target
After=network.target

[Service]
Type=forking
Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64/jre
Environment=CATALINA_PID=/opt/tomcat/temp/tomcat.pid
Environment=CATALINA_HOME=/opt/tomcat
Environment='CATALINA_OPTS=-Xms512M -Xmx1G -Djava.net.preferIPv4Stack=true'
Environment='JAVA_OPTS=-Djava.awt.headless=true'
ExecStart=/opt/tomcat/bin/startup.sh
ExecStop=/opt/tomcat/bin/shutdown.sh
SuccessExitStatus=143
User=tomcat
Group=tomcat
UMask=0007
RestartSec=10
Restart=always

[Install]
WantedBy=multi-user.target
Reload the system daemon, start Tomcat and check its status.
systemctl daemon-reload
systemctl start tomcat
systemctl status tomcat
systemctl enable tomcat
netstat -antup | grep 8080
Now Tomcat should be running at http://localhost:8080. If we run
curl http://localhost:8080
we'll get a bunch of HTML containing the Tomcat homepage.
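
Tomcat can take a few seconds to come up after systemctl start, so a curl fired right away may fail even though nothing is wrong. A small retry loop helps in scripts. This is a sketch; wait_for_http and its default of 30 attempts, 2 seconds apart, are my own assumptions.

```shell
# Hypothetical helper: poll a URL until it answers successfully or we run out of attempts.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_http() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors (4xx/5xx)
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_http http://localhost:8080/ || systemctl status tomcat
```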

Step 3: Deploy Blazegraph in Tomcat

Download the WAR distribution of Blazegraph and place it inside /opt/tomcat/webapps
cd /opt/tomcat/webapps/
wget https://master.dl.sourceforge.net/project/bigdata/bigdata/2.1.5/blazegraph.war
Once that is done, the application will be running at http://localhost:8080/blazegraph/, so if we run
curl http://localhost:8080/blazegraph/
we'll get a bunch of HTML containing the Blazegraph homepage.
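
Getting the homepage only proves that the workbench is being served. To check that the store itself answers queries, you can send a SPARQL query with curl. The sketch below assumes the default namespace is called kb and that the endpoint lives under /namespace/kb/sparql; adjust both if your installation differs. The sparql_query helper and the BLAZEGRAPH_URL variable are names I made up for this example.

```shell
# Base URL of the deployed application (assumption: Tomcat on localhost:8080)
BLAZEGRAPH_URL="${BLAZEGRAPH_URL:-http://localhost:8080/blazegraph}"

# Hypothetical helper: POST a SPARQL query and ask for JSON results.
# Arguments: query string, namespace (default "kb").
sparql_query() {
  query="$1"
  namespace="${2:-kb}"
  curl -sS \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode "query=${query}" \
    "${BLAZEGRAPH_URL}/namespace/${namespace}/sparql"
}

# Usage:
#   sparql_query 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'
```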

Step 4: Expose the application

By configuring a reverse proxy, we can make a local application accessible from the outside. Our Blazegraph is running at http://localhost:8080/blazegraph/ and we need to forward all requests for https://our.domain.com/blazegraph to that address.
nano /etc/apache2/sites-available/our.domain.com.conf
At the end of the <VirtualHost _default_:443> block, add the following configuration:
...
    ProxyRequests Off
    ProxyPreserveHost On
    <Proxy *>
        Require all granted
    </Proxy>
    RewriteEngine on
    RewriteRule ^/blazegraph$ /blazegraph/ [R]
    ProxyPass /blazegraph/  http://localhost:8080/blazegraph/
    ProxyPassReverse /blazegraph/  http://localhost:8080/blazegraph/
</VirtualHost>

And that's it.
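
One thing to watch: ProxyPass and ProxyPassReverse need the Apache modules mod_proxy and mod_proxy_http, and RewriteRule needs mod_rewrite. Here is a rough sketch for spotting missing ones from the output of apache2ctl -M. The missing_apache_modules helper is a made-up name, and it simply looks for the module names in whatever text you pass in.

```shell
# Hypothetical helper: print the modules this setup needs that are absent
# from the given "apache2ctl -M" output (one module name per line).
missing_apache_modules() {
  loaded="$1"
  for mod in proxy_module proxy_http_module rewrite_module; do
    case "$loaded" in
      *"$mod"*) ;;                     # module is loaded, nothing to report
      *) printf '%s\n' "$mod" ;;
    esac
  done
}

# Usage:
#   missing_apache_modules "$(apache2ctl -M 2>/dev/null)"
# If anything is printed, enable the modules and reload:
#   a2enmod proxy proxy_http rewrite
#   apache2ctl configtest && systemctl restart apache2
```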

Securing the app

The simplest method for securing Blazegraph (or any application running under Tomcat) is to use Basic authentication. To do so, we must define a user role, a user and a password and enable the authentication protocol.
  • At the start of the file below, uncomment the tags <security-constraint>, <login-config> and <security-role>
    nano /opt/tomcat/webapps/blazegraph/WEB-INF/web.xml
  • Append the following content to /opt/tomcat/conf/tomcat-users.xml
    nano /opt/tomcat/conf/tomcat-users.xml
      <role rolename="blazegraph"/>
      <user username="sebastian" password="ultimate-password" roles="blazegraph"/>
    
  • Restart tomcat with
    systemctl restart tomcat
Now our application should be secured with a username and password.
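
To confirm that the protection is actually on, compare the HTTP status with and without credentials: I'd expect 401 for an anonymous request and 200 once the user and password are supplied. A minimal sketch follows; http_status is a hypothetical helper and the credentials are the example ones from above.

```shell
# Hypothetical helper: print only the HTTP status code for a URL.
# Arguments: URL, optional "user:password" for Basic authentication.
http_status() {
  url="$1"
  credentials="${2:-}"
  if [ -n "$credentials" ]; then
    curl -s -o /dev/null -w '%{http_code}' -u "$credentials" "$url"
  else
    curl -s -o /dev/null -w '%{http_code}' "$url"
  fi
}

# Usage:
#   http_status http://localhost:8080/blazegraph/
#   http_status http://localhost:8080/blazegraph/ sebastian:ultimate-password
```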

Gotchas!

Just a couple of troubles to shoot:
  • In the virtual host configuration, we used RewriteEngine. For this to work we need to enable the Apache2 Rewrite module with
    a2enmod rewrite
    systemctl restart apache2 
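  • After changing any Tomcat or Blazegraph configuration file, restart Tomcat so the change takes effect
    systemctl restart tomcat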
  • If, for some reason, the Blazegraph application couldn't be deployed, take a look at the log files to see what the issue is (the date in the second file name will be the current date on your server).
    tail -n 400 /opt/tomcat/logs/catalina.out
    tail -n 400 /opt/tomcat/logs/localhost.2019-04-27.log
    If it says something like
    bigdata.jnl PERMISSION DENIED
    then we must create a folder to hold the journal file and point Blazegraph to that file
    mkdir /etc/blazegraph/
    chown tomcat:tomcat /etc/blazegraph/
    then, in both of these files, change line 12 (the journal file property) so that it points to the new location
    nano /opt/tomcat/webapps/blazegraph/WEB-INF/GraphStore.properties
    nano /opt/tomcat/webapps/blazegraph/WEB-INF/RWStore.properties
    
    com.bigdata.journal.AbstractJournal.file=/etc/blazegraph/bigdata.jnl
    and then in this file change line 42 so that it holds the absolute path to RWStore.properties
    nano /opt/tomcat/webapps/blazegraph/WEB-INF/web.xml
    
    <param-value>/opt/tomcat/webapps/blazegraph/WEB-INF/RWStore.properties</param-value>
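
Line numbers can shift between Blazegraph versions, so instead of editing line 12 by hand you could rewrite the journal property by name. This is a sketch only; set_journal_file is a hypothetical helper, it relies on GNU sed's -i option, and it assumes the path contains no "|" or "&" characters.

```shell
# Hypothetical helper: point the journal file property in a properties file at a new path.
# Fails (non-zero) if the property is not present, so nothing is changed silently.
set_journal_file() {
  props="$1"
  journal="$2"
  key='com.bigdata.journal.AbstractJournal.file'
  grep -q "^com\.bigdata\.journal\.AbstractJournal\.file=" "$props" || return 1
  # Replace the whole line, keeping the key and swapping the value
  sed -i "s|^com\.bigdata\.journal\.AbstractJournal\.file=.*|${key}=${journal}|" "$props"
}

# Usage:
#   for f in GraphStore.properties RWStore.properties; do
#     set_journal_file "/opt/tomcat/webapps/blazegraph/WEB-INF/$f" /etc/blazegraph/bigdata.jnl
#   done
#   systemctl restart tomcat
```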



Comments

  1. Hi, thank you so much. I followed your instructions and Fuseki can be accessed successfully through the Internet. However, there is a problem still confusing me: I can only get results if I execute simple queries which don't need a long processing time (less than 10 minutes in Chrome, exactly) from the backend. Otherwise I get the message "unable to get response from endpoint". I have increased the timeout for Fuseki as well as the reverse proxy, but this doesn't work for me. Do you know anything about this?

    1. Sorry, I posted in the wrong place. This is about Fuseki instead.
