The problem
Today I needed to keep an eye on PostgreSQL logs. Luckily, I had decided at installation time to log everything using the “csvlog” format. But there’s a small catch, depending on how you read that log: newline characters in database queries.
This has nothing to do with PostgreSQL directly. In fact, it does the right thing, in that it quotes all required fields. Now, a quoted field can contain a newline character. But if you read the file on a line-by-line basis (using methods like file_handle.readline), this will cause problems. No matter what programming language you use, if you call readline, it will read up to the next newline character and return that. So, let’s say you have the following CSV record:
If you read this naïvely with “readline” calls, you will get the following:
2: p2.device,
3: p2.scope,
4: p2.label,
5: p2.direction
6: FROM port p1
7: INNER JOIN port p2 USING (link)
8: WHERE p1.device='E'
9: AND p1.scope='provisioned'
10: AND p1.label='Eg'
11: AND (p1.device = p2.device
12: AND p1.scope = p2.scope
13: AND p1.label=p2.label) = false",,,,,,,,,""
Now, this is really annoying if you want to parse the file properly.
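For completeness, this is the kind of naive loop that produces the fragmented output above. It is only a minimal sketch; “postgresql.csv” is a placeholder file name.

# Naive approach: each readline() call returns one physical line,
# so a quoted query spanning several lines is torn into fragments.
with open('postgresql.csv') as file_handle:
    while True:
        line = file_handle.readline()
        if not line:
            break
        print(line.rstrip('\n'))  # each numbered fragment above is one such call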
The solution
Read the file byte by byte, and feed the accumulated data to the CSV parser only when you hit a newline outside of quoted text. Obviously you should take the newline style (\n, \r or \r\n) and the proper quote and escape characters into account when doing this.
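A minimal sketch of that idea in Python could look like the following. It assumes the PostgreSQL csvlog conventions, where fields are quoted with double quotes and a quote inside a field is escaped by doubling it; the name logical_records is mine, not part of any library.

def logical_records(file_handle, quote='"'):
    """Yield complete CSV records, keeping newlines inside quoted fields."""
    record = []
    in_quotes = False
    while True:
        char = file_handle.read(1)
        if not char:                   # end of file
            if record:
                yield ''.join(record)
            return
        if char == quote:
            in_quotes = not in_quotes  # a doubled ("") quote simply toggles twice
        if char == '\n' and not in_quotes:
            yield ''.join(record)      # a real record boundary
            record = []
        else:
            record.append(char)

A \r\n line ending leaves a trailing \r in the yielded record; a production version would strip that and honour whatever newline style and escape character the data actually uses.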
What about Python?
It turns out that Python’s csv module suffers from this problem: the built-in CSV module reads files line by line. However, it is possible to override the default behavior.
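One way to do this (assuming the logical_records helper sketched above) is to hand csv.reader an iterable that already yields complete records instead of the raw file object. csv.reader accepts any iterator of strings, so this slots in directly:

import csv

with open('postgresql.csv') as file_handle:
    for row in csv.reader(logical_records(file_handle)):
        print(row)  # one list per logical record, embedded newlines intact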
For my own purposes, I wrote a simple script that reads from the postgres log until interrupted.
You are free to use it for your own purposes, and to modify or extend it as you like.
You can find it here: exhuma/postgresql-logmon