Encrypting a directory in Linux

Security is something we developers care about. These days our most valuable information lives in our electronic devices, and if you are reading this you probably use Linux. Then yes, you have found the right post to keep your secrets safe from the bad guys.

Assuming you use Ubuntu, I'll explain how to encrypt the folder where you keep the files you don't want anyone else to access – useful, for example, if someone steals your laptop.

For this task we'll use eCryptfs, a stacked filesystem for Linux. It can be mounted on a single directory and does not require a separate partition.

The encryption mechanism is based on mounting the folder with eCryptfs. Once the directory has been mounted you can manage it as if it were a standard folder. When you finish your work and want to make the files inaccessible, you unmount the directory. When you want to use the files again, you mount the folder again.

Preparation steps:

Install eCryptfs

sudo apt-get install ecryptfs-utils

Create the required folders and change their permissions

mkdir ~/.private ~/private
chmod 0700 ~/.private ~/private

Initialize the folder mounting

Initialize eCryptfs. Answer the interactive prompts with 1 (aes cipher), 1 (16 key bytes), n (no plaintext passthrough), n (no filename encryption), yes, yes. Grab the ecryptfs_sig and remember your passphrase.

sudo mount -t ecryptfs ~/.private ~/private
...
WARNING: Based on the contents of [/root/.ecryptfs/sig-cache.txt],
it looks like you have never mounted with this key 
before. This could mean that you have typed your 
passphrase wrong.

Would you like to proceed with the mount (yes/no)? : yes


Would you like to append sig [2f5efa91218fe4d3] to
[/root/.ecryptfs/sig-cache.txt] 
in order to avoid this warning in the future (yes/no)? : yes
Successfully appended new sig to user sig cache file
Mounted eCryptfs

Once the folder has been mounted you can add your files.

The file /root/.ecryptfsrc, which saves your preferences, will be created automatically. It should look something like the sketch below. Check that no passphrase location is stored in the file; if you see one, delete that line:
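
A hedged sketch of the typical contents – the options mirror what you chose at mount time, and your sig and values will differ:

key=passphrase
ecryptfs_cipher=aes
ecryptfs_key_bytes=16
ecryptfs_passthrough=n
ecryptfs_enable_filename_crypto=n
ecryptfs_sig=2f5efa91218fe4d3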

If you need the ecryptfs_sig, it is located in:

/root/.ecryptfs/sig-cache.txt

Unmount

Now when you want to unmount your folder, so that nobody can access it:

sudo umount ~/private

Get your UID

id -u

Append an entry to /etc/fstab (use the UID and sig obtained in the previous steps)

# eCryptfs $HOME/.private mounted to $HOME/private
 /home/foo/.private /home/foo/private ecryptfs rw,noauto,nofail,uid=1000,umask=0077,relatime,ecryptfs_sig=2f5efa91218fe4d3,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs,ecryptfs_passthrough=no,ecryptfs_enable_filename_crypto=no 0 0

Ready to use

Remount

And to remount, so that you can read the data again:

sudo mount -t ecryptfs ~/private

You'll be asked to enter your passphrase every time you want to mount your folder. I hope you chose a strong one.

If you type the wrong passphrase, you need to unmount the folder before you can mount it correctly again.

sudo mount -t ecryptfs ~/private/   # wrong passphrase
sudo umount ~/private/
sudo mount -t ecryptfs ~/private/   # right passphrase

And this is it.

In conclusion, if you are like me and have villain enemies all around the globe, it's worth the 10-minute setup. Your data will be in a safer place and you'll sleep better.

You’re welcome.

Useful links:

http://ecryptfs.org/about.html

https://wiki.archlinux.org/index.php/ECryptfs

By Diego Borchers – DevOps Engineer

From AngularJS to Vue without dying in the attempt

You should know by now that here at Geoblink we work with a humongous amount of data, most of which has to be beautifully displayed across the whole app. That means that, besides needing good performance on our backend to serve that data, we need good performance on the browser side to process it and render it to our users as fast as possible.

Until now, and still for months to come, we've been relying on AngularJS as our frontend framework of choice. Needless to say, it's been a 3-year relationship and, generally speaking, we're really happy with it. It's a robust system, incredibly well supported (even though there are three newer versions ahead of it) and easy to work with.

However, with apps growing bigger and it becoming increasingly common to do most of the calculations in the browser, we started to worry about the app's performance if we decided to stick with AngularJS forever. With that in mind, we started looking at other possibilities. Considering the size of our application, we didn't even consider migrating to the newest versions of Angular: the breaking changes were so big (really a completely different framework) that we would have needed months to make the change. Of course we considered React; with its huge ecosystem and great state management tools like Redux, it was something to bear in mind. Yet we were reluctant, since we didn't think combining both technologies would really be successful. We needed something we could adopt progressively, instead of building a new application from the ground up. And then we 'viewed' VueJS (pun intended).

We had the chance to work with Vue in one of our so-called Geothons, and we fell in love IMMEDIATELY. The performance improvements were astonishing, and our development speed increased incredibly fast, even though none of us had worked with it before. Plus, the progressive part of the framework was just what we needed: we could start developing with it while maintaining the main parts of our app in AngularJS. It was a no-brainer.

[Video: Angular]

[Video: Vue]

The first test came quickly. We had to completely redo a whole section of the app, and that's where we decided to start implementing Vue within Geoblink. But enough with the mumbo jumbo, you'd say: how do you use two frameworks at the same time without them interfering with one another? Really, it couldn't be simpler, and it's all thanks to a built-in Angular directive.

<div id="main-vue-app" ng-non-bindable>
  <first-vue-component />
  <some-other-vue-component />
</div>

That's it? I told you it was simple. When you attach this directive to an HTML element, AngularJS stops watching whatever is inside it. Basically, it tells the framework not to treat anything inside as Angular code. What happens inside a non-bindable element stays within it.

With this in place, you can start developing your new VueJS app as you normally would. Declare your main app, attach your components and your stores if you're using Vuex (and you certainly should), and that's it: you have a fully working VueJS app within an AngularJS application.

// Require Vue and the Vue component files
import Vue from 'vue'
import firstVueComponent from 'path-to-component'
import someOtherComponent from 'path-to-other-component'
import mainStore from 'path-to-store'

new Vue({
  el: '#main-vue-app',
  store: mainStore,
  components: {
    firstVueComponent,
    someOtherComponent
  }
})
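
In case you're wondering, the mainStore referenced above is a regular Vuex store. A minimal sketch – the state shape and mutation here are made-up examples, not our actual store:

// 'path-to-store': a minimal Vuex store; the state and mutation
// below are hypothetical examples
import Vue from 'vue'
import Vuex from 'vuex'

Vue.use(Vuex)

export default new Vuex.Store({
  state: {
    locations: []
  },
  mutations: {
    setLocations (state, locations) {
      state.locations = locations
    }
  }
})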

As simple as it may seem, this is really all you need. Of course, things can get way more complicated, but if you are thinking of migrating your own app, as we were, I can't recommend this framework enough. So far it's been great, and we've only just started! We can't wait to keep implementing new functionality with this amazing framework.

Happy coding!

What came and what’s to come at Geoblink Tech

Happy new year! In these first days of 2018, here at Geoblink we have taken a quick look back at the technologies that got us most excited during 2017. This includes technologies that some of us had to learn to work with our existing systems, ones we played around with just for fun, and others that were new to us and cool enough to end up in our production systems.

Not only that, but we also compared that list against the technologies that each of us is looking forward to learning or working with in 2018. We hope you find the list interesting, and if you want to comment on it, let us know on Twitter (@geoblinkTech).

Read more

2 days of fun and data at Big Data Spain 2017

On Thursday and Friday last week a few geoblinkers from the Tech team were fortunate enough to attend Big Data Spain in Madrid, “one of the three largest conferences in Europe about Big Data”.

The line-up of speakers this year was amazing and they certainly didn't disappoint. Moreover, our VP of Technology Miguel Ángel Fajardo and our Lead Data Scientist Daniel Domínguez had the chance to actively participate as speakers with a thought-provoking talk titled "Relational is the new Big Data", in which we highlighted how relational databases can today solve many use cases regardless of the size of your dataset, while adding lots of benefits with respect to NoSQL options.

Relational is the new Big Data

Read more

Happy GIS Day!

From Geoblink we want to celebrate GIS Day by sharing with you our latest improvement to the catchment area computation.

As Carlos explained in a previous post, at Geoblink we take advantage of graph theory to compute the catchment area of a location. We define our graph as a set of intersections (nodes) connected by street/road segments (links), and we add some defining properties to both nodes and links.

The new property added to our links is the traffic peak, which allows us to compute catchment areas that take rush hours into account. We first define a location of interest. Then we apply our set of algorithms to the graph to compute the catchment area of that location, specifying whether rush hours must be considered or not. As a result, we obtain the set of intersections and street/road segments that make up the catchment area of the location of interest, with or without traffic peaks. A minimal sketch of the idea follows.
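
To make the traversal concrete, here is a minimal sketch (not our production code – the graph API and property names are invented for illustration) of a time-limited shortest-path search where each link carries two travel times:

// Collect every node reachable from startNode within `budget` seconds.
// Each link stores a free-flow and a peak-hour travel time;
// `rushHour` picks which one applies.
function catchmentArea (graph, startNode, budget, rushHour) {
  const cost = new Map([[startNode, 0]])
  const queue = [[0, startNode]]
  while (queue.length > 0) {
    queue.sort((a, b) => a[0] - b[0]) // naive priority queue
    const [time, node] = queue.shift()
    if (time > cost.get(node)) continue // stale queue entry
    for (const link of graph.linksFrom(node)) {
      const travel = rushHour ? link.peakSeconds : link.freeFlowSeconds
      const next = time + travel
      const best = cost.has(link.to) ? cost.get(link.to) : Infinity
      if (next <= budget && next < best) {
        cost.set(link.to, next)
        queue.push([next, link.to])
      }
    }
  }
  return Array.from(cost.keys()) // nodes inside the catchment area
}

Since peak-hour travel times are larger, fewer nodes fit within the same time budget.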

But how does it work? Let's compute Geoblink's catchment area for 5 minutes of travel by car, both ignoring rush hours and considering them.

[Figure: 5-minute drive-time catchment area]

[Figure: 5-minute drive-time catchment area during peak traffic]

As you can see, the catchment area that takes peak traffic into consideration is smaller than the other one. Adding the traffic peak property to our street/road segments allows us to provide our clients with a more accurate catchment area when rush hours must be considered.

Learning from nothing or almost


In my first article for this blog, I talked about how my teachers and a team of students I joined used the latest Deep Learning (DL) technologies to help fight cancer. The goal was to segment and colorize the areas in a scanner image corresponding to biological tissues, which could be used to estimate the health of the patient and, in turn, to formulate a better treatment.

 

[Figures: a non-colored scan, followed by three colorizations of a scanner image]

At that time, in order to focus on how technology can improve medicine, I had to skip a crucial component of the IODA project, namely the pre-training of the auto-encoders. By presenting it here, I'll try to illustrate some of the challenges of deep learning.

So first, let's have a quick refresher on deep learning.

Read more

LATERAL, a hidden gem in PostgreSQL

LATERAL is a very useful tool that every SQL user should have in their toolbelt. However, it is normally not covered in introductory courses, so many people miss it.

Suppose we want to find the perfect match between two tables. For example, we could have a table with all students and a table with all schools, and we would like to find the school that is closest to each student. Another example: we have a table of users with some preferences and another table of products, and we want to find the product most similar to what each user wants. In cases like these, LATERAL is often the right tool for the job, as in the sketch below.
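
As a quick, hedged sketch (the table and column names are invented for illustration), the student/school match could be written like this:

-- For each student, pick the single closest school
SELECT st.name AS student,
       closest.name AS school,
       closest.distance
FROM students st
CROSS JOIN LATERAL (
  SELECT sc.name,
         sqrt(power(sc.x - st.x, 2) + power(sc.y - st.y, 2)) AS distance
  FROM schools sc
  ORDER BY distance
  LIMIT 1
) AS closest;

The subquery runs once per student row – that ability to reference columns of a preceding FROM item is exactly what LATERAL adds.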

Let's try to solve the problem with mock data.

Read more

JS meets SQL. Say hi to AlaSQL!

At Geoblink, we often find ourselves moving a lot of data from our database to the frontend to display it in useful charts and graphs for our users. Although restructuring data from the backend is usually necessary, it is not especially challenging, as the queries to the database are crafted to return the data the way we need it for each purpose. For these cases, Lodash is more than enough to filter and fit the data to the needs of the frontend. But what happens when we do not query the data ourselves? Sometimes, when using third-party APIs, the data may not be structured the way we want it. We need a way to organize this data quickly and easily.

Enter AlaSQL. AlaSQL gives us the power of SQL queries in JavaScript, in a fast, in-memory SQL database. With this library, we can store records in a relational table format and query the data using SQL syntax. This is extremely useful for counting or grouping a large number of records in a flash, without having to overthink how to manipulate the JSONs to achieve the required structure. A small sketch follows.
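
As a small, hedged example (the record shape is made up here), AlaSQL can query a plain array of objects as if it were a table:

// Group made-up third-party records with a single SQL statement
const alasql = require('alasql')

const records = [
  { city: 'Madrid', sales: 10 },
  { city: 'Madrid', sales: 5 },
  { city: 'Paris', sales: 7 }
]

// '?' binds to the array passed in the parameters list
const totals = alasql(
  'SELECT city, SUM(sales) AS totalSales FROM ? GROUP BY city',
  [records]
)
// totals => [ { city: 'Madrid', totalSales: 15 }, { city: 'Paris', totalSales: 7 } ]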

Read more

Automating data pipelines with Jenkins

One of the cool things about being a Data Scientist at Geoblink is that we get to work on all stages of the data science workflow and touch a very diverse stack of technologies. As part of our daily tasks we gather data from a range of sources, clean it and load it into our database; run and validate machine learning models; and work closely with our DevOps/Infrastructure team to maintain our databases.

As happens in other start-ups, as we grow rapidly it becomes more and more important to automate routine (and frankly boring) tasks, which take away precious development time from our core developers, but also from us data scientists.

While automation tools have long been used in software development teams, the increasing complexity of data science cycles has made clear the need for workflow management tools that automate these processes. No surprise then that both Spotify and AirBnB have built (and even better, open-sourced!) internal tools with that aim: Luigi and Airflow.

As part of our effort to iterate faster and deliver our clients' requests promptly, over the last couple of weeks I've spent some time working with the great automation tool we use, Jenkins, and in this post I'd like to give you a taste of how we use it in Geoblink's Data team, starting with the sketch below.
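
To give an idea of the shape this can take, here is a hypothetical declarative Jenkinsfile (not our actual pipeline – the stage names and scripts are invented) for a nightly fetch-clean-load job:

// Hypothetical pipeline: fetch raw data, clean it, load it into the DB
pipeline {
  agent any
  triggers { cron('H 2 * * *') } // run once a night
  stages {
    stage('Fetch') {
      steps { sh './scripts/fetch_sources.sh' }
    }
    stage('Clean') {
      steps { sh './scripts/clean_data.sh' }
    }
    stage('Load') {
      steps { sh './scripts/load_into_db.sh' }
    }
  }
}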

Read more

Using Deep Learning to heal people suffering from cancer

DL is cool

Sometimes we happily use Deep Learning for futile things like generating faces or turning horses into zebras. But most of the time, it's a powerful tool that can help save lives.

At the INSA of Rouen, I worked in a team of students implementing a solution based on an article published by researchers, some of whom were my teachers. The article is called "IODA: An input/output deep architecture for image labeling" and was written by Julien Lerouge, Romain Herault, Clément Chatelain, Fabrice Jardin and Romain Modzelewski. Image labeling is the act of determining zones in an image and saying 'this zone corresponds to the sky' or 'this zone corresponds to a pedestrian'. But what's fantastic about their work is that it also performs image segmentation (it also detects where the boundaries of the zones are).

[Figure: example of image segmentation]

Read more