quick peek at postgres post install on ubuntu


I’m checking out how postgres works.

I installed it on 32-bit ubuntu version 12.04. Googling led to multiple places with the appropriate apt-get commands.

The install created a user, postgres, that runs the postgres binaries. However, root owns the binary files.

A default instance got created. I haven’t learned yet if people often run multiple postgres instances per host.

The following processes run with the default installation:
postgres: writer process
postgres: WAL process
postgres: autovacuum launcher
postgres: stats collector process

The parent of all of these is the main postgres process, which was launched with the -D parameter pointing to the data directory and the -c parameter setting config_file to the full path of the postgresql.conf file.

The writer process, I’m surmising, must write to data files, and the WAL process, I’ve read elsewhere, writes the Write Ahead Log, similar to the redo log writer in Oracle. The autovacuum launcher governs the ability to automatically run the VACUUM command, which Postgres needs periodically. The stats collector sounds like it would update query optimization stats, but it actually gathers activity statistics (table and index access counts, for example); the optimizer’s statistics come from ANALYZE.
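As a rough sketch of how to pull those process names out of a process listing, the Python below scans text for the "postgres: " prefix that the helper processes put in their titles. The function name postgres_process_titles is my own invention, and the sample text is just the four lines from the list above, not real ps output.

```python
def postgres_process_titles(ps_output):
    """Return the part after 'postgres: ' for each line that contains it.

    ps_output is plain text, one process per line, for example the
    output of `ps -eo args` on Linux (an assumption about how you
    would gather it; any listing with the process titles works).
    """
    marker = "postgres: "
    titles = []
    for line in ps_output.splitlines():
        idx = line.find(marker)
        if idx != -1:
            # Keep only the descriptive part of the process title.
            titles.append(line[idx + len(marker):].strip())
    return titles


# Sample input copied from the process list in this post.
sample = """postgres: writer process
postgres: WAL process
postgres: autovacuum launcher
postgres: stats collector process"""

for title in postgres_process_titles(sample):
    print(title)
```

The main postgres process does not carry the "postgres: " prefix in its title, so a filter like this only picks up the helper processes.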

There’s a command, psql, that is the equivalent of sqlplus. I’ll explore psql in a follow-up post.

Documentation for postgres can be found at postgresql.org.

Having worked as a SQL Server and Oracle DBA, I know that keeping track of database storage is important. Documentation for those two products describes early on how each system places all objects into datafiles. A datafile can contain tables, indexes, stored procedures, views and everything else.

Postgres, on the other hand, relegates discussion of physical storage to a location fairly deep in the documentation. Each table gets its own datafile. A master directory tree contains all the objects in the postgres database, with most objects getting their own separate file. And postgres dictates the directory structure, although perhaps in more advanced deployments users can control some aspects. The filenames are numbers that postgres generates automatically. My instance installed to /var/lib/postgresql/9.1/main. There are multiple sub-directories below that.
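To make the numbered filenames a bit more concrete, here is a small Python sketch of how a table’s file path is put together. It assumes the table lives in the default tablespace, where files sit under base/<database OID>/<relfilenode> inside the data directory. The function name relation_file_path and the two numbers in the example (16384 and 16385) are made up for illustration; the real values come from pg_database and pg_class.

```python
import posixpath


def relation_file_path(data_dir, database_oid, relfilenode):
    """Build the path of a table's first data file.

    Assumes the default tablespace, where postgres stores each
    relation as <data_dir>/base/<database OID>/<relfilenode>.
    """
    return posixpath.join(data_dir, "base", str(database_oid), str(relfilenode))


# The data directory is the one from this post; the two numbers are
# hypothetical placeholders, not values read from a real instance.
print(relation_file_path("/var/lib/postgresql/9.1/main", 16384, 16385))
```

Inside psql, SELECT pg_relation_filepath('some_table') reports the same path relative to the data directory, so there is no need to assemble it by hand; the sketch is only there to show the layout.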

Done writing for now, but I’m going to create some tables, bang around with psql and try out the gui admin tool pgadmin III.

learning postgres


I work at EMC in the Backup and Recovery Services (BRS) division, and we use postgres. It powers our backup software catalog for Avamar. We use it as a database repository for Data Protection Advisor (DPA). And it was the first database to be virtualized automatically by VMware’s vFabric Data Director.

In the Big Data landscape, postgres pops up all the time. Greenplum uses it. Netezza uses it. Hadapt, a newcomer to the big data space, uses it. I think maybe Platfora uses it, but by this point my head is spinning and I can’t even remember where I read that. And Cloudera uses it to store management data.

And probably a bizillion other pieces of software use it.

I’m interested in EMC and in big data so I’m going to start learning postgres.

I’ll finish up with the “What about mysql?” question. In general, I’d always read that mysql is easy to learn, fast by default and deployed widely for small web apps. And that postgres is slower, but more reliable and feature-rich. Some recent browsing reminded me that MySQL has corporate backing, first from its original corporate owners, then from Sun and now from Oracle, which currently owns it. And Postgres remains 100% open source, developed by its community rather than owned by a single company. Finally, mysql allows users to pick their storage engine while Postgres provides for just one.

mysql vs postgres links for reference:
One on wikivs.com
A stackoverflow question with responses
An ancient databasejournal.com article still getting traffic
A blog posting by Chris Travers