PostgreSQL: the world's most sophisticated database.

The French-speaking PostgreSQL planet

Wednesday, April 27, 2016

PostgreSQL.fr News

PostgreSQL Weekly News - April 24, 2016

PostgreSQL Session on September 22, 2016 in Lyon, France. The call for papers closes on May 20: send your proposals to call-for-paper AT postgresql-sessions DOT org.

PostgresOpen 2016 will take place in Dallas, Texas (USA) from September 13 to 16. The call for speakers has been launched: https://2016.postgresopen.org/callforpapers/

Product news

PostgreSQL-related job offers for April

PostgreSQL Local

  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from May 2 to 5, 2016: https://2016.foss4g-na.org/
  • PGCon 2016 will be held from May 17 to 21, 2016 in Ottawa: http://www.pgcon.org/
  • The Swiss PGDay will be held this year at the University of Applied Sciences (HSR) in Rapperswil on June 24, 2016: http://www.pgday.ch/
  • "5432 ... Meet us!" will take place in Milan (Italy) on June 28 & 29, 2016. Registration is open: http://5432meet.us/
  • PG Day UK will take place on July 5, 2016: http://www.pgconf.uk/
  • PostgreSQL Session on September 22, 2016 in Lyon, France. The call for papers closes on May 20: send your proposals to call-for-paper AT postgresql-sessions DOT org.
  • PgConf Silicon Valley 2016 will take place from November 14 to 16, 2016: http://www.pgconfsv.com/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team, under a CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160424215058.GA9189@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send English submissions to david (a) fetter.org, German ones to pwn (a) pgug.de, Italian ones to pwn (a) itpug.org and Spanish ones to pwn (a) arpug.com.ar.

Applied patches

Peter Eisentraut pushed:

Fujii Masao pushed:

Tom Lane pushed:

  • Further reduce the number of semaphores used under --disable-spinlocks. Per discussion, there doesn't seem to be much value in having NUM_SPINLOCK_SEMAPHORES set to 1024: under any scenario where you are running more than a few backends concurrently, you really had better have a real spinlock implementation if you want tolerable performance. And 1024 semaphores is a sizable fraction of the system-wide SysV semaphore limit on many platforms. Therefore, reduce this setting's default value to 128 to make it less likely to cause out-of-semaphores problems. http://git.postgresql.org/pg/commitdiff/75c24d0f7491f77dfbc0acdf6c18439f288353ef
  • Make partition-lock-release coding more transparent in BufferAlloc(). Coverity complained that oldPartitionLock was possibly dereferenced after having been set to NULL. That actually can't happen, because we'd only use it if (oldFlags & BM_TAG_VALID) is true. But nonetheless Coverity is justified in complaining, because at line 1275 we actually overwrite oldFlags, and then still expect its BM_TAG_VALID bit to be a safe guide to whether to release the oldPartitionLock. Thus, the code would be incorrect if someone else had changed the buffer's BM_TAG_VALID flag meanwhile. That should not happen, since we hold pin on the buffer throughout this sequence, but it's starting to look like a rather shaky chain of logic. And there's no need for such assumptions, because we can simply replace the (oldFlags & BM_TAG_VALID) tests with (oldPartitionLock != NULL), which has identical results and makes it plain to all comers that we don't dereference a null pointer. A small side benefit is that the range of liveness of oldFlags is greatly reduced, possibly allowing the compiler to save a register. This is just cleanup, not an actual bug fix, so there seems no need for a back-patch. http://git.postgresql.org/pg/commitdiff/a0382e2d7e330de13e15cea0921a95faa9da3570
  • Improve regression tests for degree-based trigonometric functions. Print the actual value of each function result that's expected to be exact, rather than merely emitting a NULL if it's not right. Although we print these with extra_float_digits = 3, we should not trust that the platform will produce a result visibly different from the expected value if it's off only in the last place; hence, also include comparisons against the exact values as before. This is a bit bulkier and uglier than the previous printout, but it will provide more information and be easier to interpret if there's a test failure. Discussion: <18241.1461073100@sss.pgh.pa.us> http://git.postgresql.org/pg/commitdiff/4db0d2d2fe935e086dfd26c00f707dab298b443c
  • Fix memory leak and other bugs in ginPlaceToPage() & subroutines. Commit 36a35c550ac114ca turned the interface between ginPlaceToPage and its subroutines in gindatapage.c and ginentrypage.c into a royal mess: page-update critical sections were started in one place and finished in another place not even in the same file, and the very same subroutine might return having started a critical section or not. Subsequent patches band-aided over some of the problems with this design by making things even messier. One user-visible resulting problem is memory leaks caused by the need for the subroutines to allocate storage that would survive until ginPlaceToPage calls XLogInsert (as reported by Julien Rouhaud). This would not typically be noticeable during retail index updates. It could be visible in a GIN index build, in the form of memory consumption swelling to several times the commanded maintenance_work_mem. Another rather nasty problem is that in the internal-page-splitting code path, we would clear the child page's GIN_INCOMPLETE_SPLIT flag well before entering the critical section that it's supposed to be cleared in; a failure in between would leave the index in a corrupt state. There were also assorted coding-rule violations with little immediate consequence but possible long-term hazards, such as beginning an XLogInsert sequence before entering a critical section, or calling elog(DEBUG) inside a critical section. To fix, redefine the API between ginPlaceToPage() and its subroutines by splitting the subroutines into two parts. The "beginPlaceToPage" subroutine does what can be done outside a critical section, including full computation of the result pages into temporary storage when we're going to split the target page. The "execPlaceToPage" subroutine is called within a critical section established by ginPlaceToPage(), and it handles the actual page update in the non-split code path. The critical section, as well as the XLOG insertion call sequence, are both now always started and finished in ginPlaceToPage(). Also, make ginPlaceToPage() create and work in a short-lived memory context to eliminate the leakage problem. (Since a short-lived memory context had been getting created in the most common code path in the subroutines, this shouldn't cause any noticeable performance penalty; we're just moving the overhead up one call level.) In passing, fix a bunch of comments that had gone unmaintained throughout all this klugery. Report: <571276DD.5050303@dalibo.com> http://git.postgresql.org/pg/commitdiff/bde361fef5ea3c65074a0c95c724fae5ac8a1bb5
  • Honor PGCTLTIMEOUT environment variable for pg_regress' startup wait. In commit 2ffa86962077c588 we made pg_ctl recognize an environment variable PGCTLTIMEOUT to set the default timeout for starting and stopping the postmaster. However, pg_regress uses pg_ctl only for the "stop" end of that; it has bespoke code for starting the postmaster, and that code has historically had a hard-wired 60-second timeout. Further buildfarm experience says it'd be a good idea if that timeout were also controlled by PGCTLTIMEOUT, so let's make it so. Like the previous patch, back-patch to all active branches. Discussion: <13969.1461191936@sss.pgh.pa.us> http://git.postgresql.org/pg/commitdiff/cbabb70f35bb0e5bac84b9f15ecadc82868ad9f9
  • PGDLLIMPORT-ify old_snapshot_threshold. Revert commit 7cb1db1d9599f0a09d6920d2149d956ef6d88b0e, which represented a misunderstanding of the problem (if snapmgr.h weren't already included in bufmgr.h, things wouldn't compile anywhere). Instead install what I think is the real fix. http://git.postgresql.org/pg/commitdiff/14216649f3dc8bd9839702440dd593e958b0920b
  • Fix ruleutils.c's dumping of ScalarArrayOpExpr containing an EXPR_SUBLINK. When we shoehorned "x op ANY (array)" into the SQL syntax, we created a fundamental ambiguity as to the proper treatment of a sub-SELECT on the righthand side: perhaps what's meant is to compare x against each row of the sub-SELECT's result, or perhaps the sub-SELECT is meant as a scalar sub-SELECT that delivers a single array value whose members should be compared against x. The grammar resolves it as the former case whenever the RHS is a select_with_parens, making the latter case hard to reach --- but you can get at it, with tricks such as attaching a no-op cast to the sub-SELECT. Parse analysis would throw away the no-op cast, leaving a parsetree with an EXPR_SUBLINK SubLink directly under a ScalarArrayOpExpr. ruleutils.c was not clued in on this fine point, and would naively emit "x op ANY ((SELECT ...))", which would be parsed as the first alternative, typically leading to errors like "operator does not exist: text = text[]" during dump/reload of a view or rule containing such a construct. To fix, emit a no-op cast when dumping such a parsetree. This might well be exactly what the user wrote to get the construct accepted in the first place; and even if she got there with some other dodge, it is a valid representation of the parsetree. Per report from Karl Czajkowski. He mentioned only a case involving RLS policies, but actually the problem is very old, so back-patch to all supported branches. Report: <20160421001832.GB7976@moraine.isi.edu> http://git.postgresql.org/pg/commitdiff/1f7c85b820814810f985a270e92cde4c12ceded4
  • Remove dead code in win32.h. There's no longer a need for the MSVC-version-specific code stanza that forcibly redefines errno code symbols, because since commit 73838b52 we're unconditionally redefining them in the stanza before this one anyway. Now it's merely confusing and ugly, so get rid of it; and improve the comment that explains what's going on here. Although this is just cosmetic, back-patch anyway since I'm intending to back-patch some less-cosmetic changes in this same hunk of code. http://git.postgresql.org/pg/commitdiff/e54528155a3c4159b01327534691c3342a371cab
  • Improve TranslateSocketError() to handle more Windows error codes. The coverage was rather lean for cases that bind() or listen() might return. Add entries for everything that there's a direct equivalent for in the set of Unix errnos that elog.c has heard of. http://git.postgresql.org/pg/commitdiff/125ad539a275db5ab8f4647828b80a16d02eabd2
  • Fix planner failure with full join in RHS of left join. Given a left join containing a full join in its righthand side, with the left join's joinclause referencing only one side of the full join (in a non-strict fashion, so that the full join doesn't get simplified), the planner could fail with "failed to build any N-way joins" or related errors. This happened because the full join was seen as overlapping the left join's RHS, and then recent changes within join_is_legal() caused that function to conclude that the full join couldn't validly be formed. Rather than try to rejigger join_is_legal() yet more to allow this, I think it's better to fix initsplan.c so that the required join order is explicit in the SpecialJoinInfo data structure. The previous coding there essentially ignored full joins, relying on the fact that we don't flatten them in the joinlist data structure to preserve their ordering. That's sufficient to prevent a wrong plan from being formed, but as this example shows, it's not sufficient to ensure that the right plan will be formed. We need to work a bit harder to ensure that the right plan looks sane according to the SpecialJoinInfos. Per bug #14105 from Vojtech Rylko. This was apparently induced by commit 8703059c6 (though now that I've seen it, I wonder whether there are related cases that could have failed before that); so back-patch to all active branches. Unfortunately, that patch also went into 9.0, so this bug is a regression that won't be fixed in that branch. http://git.postgresql.org/pg/commitdiff/80f66a9ad06eafa91ffc5ff19c725c7f393c242e
  • Fix unexpected side-effects of operator_precedence_warning. The implementation of that feature involves injecting nodes into the raw parsetree where explicit parentheses appear. Various places in parse_expr.c that test to see "is this child node of type Foo" need to look through such nodes, else we'll get different behavior when operator_precedence_warning is on than when it is off. Note that we only need to handle this when testing untransformed child nodes, since the AEXPR_PAREN nodes will be gone anyway after transformExprRecurse. Per report from Scott Ribe and additional code-reading. Back-patch to 9.5 where this feature was added. Report: <ED37E303-1B0A-4CD8-8E1E-B9C4C2DD9A17@elevated-dev.com> http://git.postgresql.org/pg/commitdiff/abb164655c703a5013b7fcf83f855a071895dc91
  • Convert contrib/seg's bool-returning SQL functions to V1 call convention. It appears that we can no longer get away with using V0 call convention for bool-returning functions in newer versions of MSVC. The compiler seems to generate code that doesn't clear the higher-order bits of the result register, causing the bool result Datum to often read as "true" when "false" was intended. This is not very surprising, since the function thinks it's returning a bool-width result but fmgr_oldstyle assumes that V0 functions return "char *"; what's surprising is that that hack worked for so long on so many platforms. The only functions of this description in core+contrib are in contrib/seg, which we'd intentionally left mostly in V0 style to serve as a warning canary if V0 call convention breaks. We could imagine hacking things so that they're still V0 (we'd have to redeclare the bool-returning functions as returning some suitably wide integer type, like size_t, at the C level). But on the whole it seems better to convert 'em to V1. We can still leave the pointer- and int-returning functions in V0 style, so that the test coverage isn't gone entirely. Back-patch to 9.5, since our intention is to support VS2015 in 9.5 and later. There's no SQL-level change in the functions' behavior so back-patching should be safe enough. Discussion: <22094.1461273324@sss.pgh.pa.us> Michael Paquier, adjusted some by me http://git.postgresql.org/pg/commitdiff/c8e81afc60093b199a128ccdfbb692ced8e0c9cd
  • Rename strtoi() to strtoint(). NetBSD has seen fit to invent a libc function named strtoi(), which conflicts with the long-established static functions of the same name in datetime.c and ecpg's interval.c. While muttering darkly about intrusions on application namespace, we'll rename our functions to avoid the conflict. Back-patch to all supported branches, since this would affect attempts to build any of them on recent NetBSD. Thomas Munro http://git.postgresql.org/pg/commitdiff/0ab3595e5bb53a8fc2cd231320b1af1ae3ed68e0
  • Improve PostgresNode.pm's logic for detecting already-in-use ports. Buildfarm members bowerbird and jacana have shown intermittent "could not bind IPv4 socket" failures in the BinInstallCheck stage since mid-December, shortly after commits 1caef31d9e550408 and 9821492ee417a591 changed the logic for selecting which port to use in temporary installations. One plausible explanation is that we are randomly selecting ports that are already in use for some non-Postgres purpose. Although the code tried to defend against already-in-use ports, it used pg_isready to probe the port which is quite unhelpful: if some non-Postgres server responds at the given address, pg_isready will generally say "no response", leading to exactly the wrong conclusion about whether the port is free. Instead, let's use a simple TCP connect() call to see if anything answers without making assumptions about what it is. Note that this means there's no direct check for a conflicting Unix socket, but that should be okay because there should be no other Unix sockets in use in the temporary socket directory created for a test run. This is only a partial solution for the TCP case, since if the port number is in use for an outgoing connection rather than a listening socket, we'll fail to detect that. We could try to bind() to the proposed port as a means of detecting that case, but that would introduce its own failure modes, since the system might consider the address to remain reserved for some period of time after we drop the bound socket. Close study of the errors returned by bowerbird and jacana suggests that what we're seeing there may be conflicts with listening not outgoing sockets, so let's try this and see if it improves matters. It's certainly better than what's there now, in any case. Michael Paquier, adjusted by me to work on non-Windows as well as Windows http://git.postgresql.org/pg/commitdiff/fab84c7787f25756a9d7bcb8bc89145d237e8e85

Kevin Grittner pushed:

  • Revert no-op changes to BufferGetPage(). The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas http://git.postgresql.org/pg/commitdiff/a343e223a5c33a7283a6d8b255c9dbc48dbc5061
  • Inline initial comparisons in TestForOldSnapshot(). Even with old_snapshot_threshold = -1 (which disables the "snapshot too old" feature), performance regressions were seen at moderate to high concurrency. For example, a one-socket, four-core system running 200 connections at saturation could see up to a 2.3% regression, with larger regressions possible on NUMA machines. By inlining the early (smaller, faster) tests in the TestForOldSnapshot() function, the i7 case dropped to a 0.2% regression, which could easily just be noise, and is clearly an improvement. Further testing will show whether more is needed. http://git.postgresql.org/pg/commitdiff/11e178d0dc4bc2328ae4759090b3c48b07023fab
  • Include snapmgr.h in blscan.c. Windows builds on buildfarm are failing because old_snapshot_threshold is not found in the bloom filter contrib module. http://git.postgresql.org/pg/commitdiff/7cb1db1d9599f0a09d6920d2149d956ef6d88b0e

Magnus Hagander pushed:

Robert Haas pushed:

  • Forbid parallel Hash Right Join or Hash Full Join. That won't work. You'll get bogus null-extended rows. Mithun Cy http://git.postgresql.org/pg/commitdiff/9c75e1a36b6b2f3ad9f76ae661f42586c92c6f7c
  • Add pg_dump support for the new PARALLEL option for aggregates. This was an oversight in commit 41ea0c23761ca108e2f08f6e3151e3cb1f9652a1. Fabrízio de Royes Mello, per a report from Tushar Ahuja http://git.postgresql.org/pg/commitdiff/b4e0f183826e85fd43248d5047eddf393c3d8a30
  • postgres_fdw: Don't push down certain full joins. If there's a filter condition on either side of a full outer join, it is neither correct to attach it to the join's ON clause nor to throw it into the toplevel WHERE clause. Just don't push down the join in that case. To maximize the number of cases where we can still push down full joins, push inner join conditions into the ON clause at the first opportunity rather than postponing them to the top-level WHERE clause. This produces nicer SQL, anyway. This bug was introduced in e4106b2528727c4b48639c0e12bf2f70a766b910. Ashutosh Bapat, per report from Rajkumar Raghuwanshi. http://git.postgresql.org/pg/commitdiff/5b1f9ce1d9e8dcae2bcd93b2becffaba5e4f3049
  • Allow queries submitted by postgres_fdw to be canceled. This fixes a problem which is not new, but with the advent of direct foreign table modification in 0bf3ae88af330496517722e391e7c975e6bad219, it's somewhat more likely to be annoying than previously. So, arrange for a local query cancelation to propagate to the remote side. Michael Paquier, reviewed by Etsuro Fujita. Original report by Thom Brown. http://git.postgresql.org/pg/commitdiff/f039eaac7131ef2a4cf63a10cf98486f8bcd09d2
  • Fix assorted defects in 09adc9a8c09c9640de05c7023b27fb83c761e91c. That commit increased all shared memory allocations to the next higher multiple of PG_CACHE_LINE_SIZE, but it didn't ensure that allocation started on a cache line boundary. It also failed to remove a couple other pieces of now-useless code. BUFFERALIGN() is perhaps obsolete at this point, and likely should be removed at some point, too, but that seems like it can be left to a future cleanup. Mistakes all pointed out by Andres Freund. The patch is mine, with a few extra assertions which I adopted from his version of this fix. http://git.postgresql.org/pg/commitdiff/9f84280ae94b43b75dcf32aef433545335e7bb16
  • Comment improvements for ForeignPath. It's not necessarily just scanning a base relation any more. Amit Langote and Etsuro Fujita http://git.postgresql.org/pg/commitdiff/36f69faeff540cd93de0b6aa7c2d2a7781d637a6
  • Prevent possible crash reading pg_stat_activity. Also, avoid reading PGPROC's wait_event field twice, once for the wait event and again for the wait_event_type, because the value might change in the middle. Petr Jelinek and Robert Haas http://git.postgresql.org/pg/commitdiff/c4a586c4860477ddae6d4f9cef88486f0e37c37e

Bruce Momjian pushed:

Andres Freund pushed:

  • Fix documentation & config inconsistencies around 428b1d6b2. Several issues: 1) checkpoint_flush_after doc and code disagreed about the default 2) new GUCs were missing from postgresql.conf.sample 3) Outdated source-code comment about bgwriter_flush_after's default 4) Sub-optimal categories assigned to new GUCs 5) Docs suggested backend_flush_after is PGC_SIGHUP, but it's PGC_USERSET. 6) Spell out int as integer in the docs, as done elsewhere Reported-By: Magnus Hagander, Fujii Masao Discussion: CAHGQGwETyTG5VYQQ5C_srwxWX7RXvFcD3dKROhvAWWhoSBdmZw@mail.gmail.com http://git.postgresql.org/pg/commitdiff/8f91d87d43d021db92c6edd966a4bb8c3a81ae39

Rejected patches (so far)

No one was disappointed this week :-)

Pending patches

Michaël Paquier sent in a patch to ensure that a reserved role is never a member of another role or group.

Kyotaro HORIGUCHI sent in a patch to fix the documentation for synchronous_standby_names.

Kyotaro HORIGUCHI sent in two more revisions of a patch to fix synchronous replication update configuration.

Michaël Paquier sent in another revision of a patch to fix an OOM in libpq and infinite loop with getCopyStart().

Fujii Masao sent in a patch to add error checks to BRIN summarize new values.

Michaël Paquier sent in another revision of a patch to do hot standby checkpoints.

Amit Langote sent in two more revisions of a patch to implement declarative partitioning.

Amit Langote sent in two revisions of a patch to fix some issues in the Bloom documentation.

David Rowley sent in two more revisions of a patch to fix EXPLAIN VERBOSE with parallel aggregate.

Ants Aasma sent in another revision of a patch to update old snapshot map once per tick.

Dmitry Ivanov sent in a patch to fix some of the documentation for the new phrase search capability.

Michaël Paquier sent in a patch to change contrib/seg/ to convert functions to use the V1 declaration.

Juergen Hannappel sent in a patch to add an option to pg_dumpall to exclude tables from the dump.

Andres Freund sent in a patch to keep from opening formally non-existent segments in _mdfd_getseg().

Thomas Munro sent in a patch to implement kqueue for *BSD.

Amit Kapila sent in a patch to fix an old snapshot threshold performance issue.

Noah Misch sent in a patch to add xlc atomics.

Andres Freund sent in a patch to emit invalidations to standby for transactions without xid.

Andrew Dunstan sent in a patch to add transactional enum additions.

Andrew Dunstan sent in a patch to add VS2015 support.

Simon Riggs sent in a patch to fix some suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

by N Bougain on Wednesday, April 27, 2016 at 00:17

Thursday, April 21, 2016

Adrien Nayrat

BRIN Indexes – Performance

PostgreSQL 9.5, released in January 2016, introduced a new index type: BRIN indexes, short for Block Range INdex. They are recommended for very large tables whose data are correlated with their physical location. I have decided to devote a series of articles to these indexes:

For your information, I will be at PGDay France in Lille on Tuesday, May 31 to present this index type. There will be plenty of other interesting talks as well!

This article is the last of the series and covers performance (maintenance, reads, inserts…).

Performance

The previous articles covered how BRIN indexes work and what makes them special. This one focuses on performance. The earlier examples used small data volumes; now let's see what these indexes can bring on a much larger volume.

The tests were run on a laptop; the WAL, the table and the indexes are stored on a spinning disk. Results will differ depending on the hardware used. These figures are purely indicative and mainly serve to give an order of magnitude.

Example

First, we need to create a table with a significant data volume.

For example: a metering system with 100 probes and one measurement per second. Over a year that gives 100*365*24*3600 measurements, i.e. a little more than 3 billion rows.

-- Use a function to generate random text.
-- Found here: http://stackoverflow.com/questions/3970795/how-do-you-create-a-random-string-in-postgresql

CREATE OR REPLACE FUNCTION random_string(LENGTH INTEGER) RETURNS text AS 
$$
DECLARE
  chars text[] := '{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}';
  RESULT text := '';
  i INTEGER := 0;
BEGIN
  IF LENGTH < 0 THEN
    raise exception 'Given length cannot be less than 0';
  END IF;
  FOR i IN 1..LENGTH loop
    RESULT := RESULT || chars[1+random()*(array_length(chars, 1)-1)];
  END loop;
  RETURN RESULT;
END;
$$ LANGUAGE plpgsql;
-- Create the table holding the probes
CREATE TABLE probe (id serial PRIMARY KEY, name text);
INSERT INTO probe (name ) SELECT random_string(5) FROM generate_series(1,100);

CREATE TABLE data AS
WITH generation AS (
SELECT '2015-01-01'::TIMESTAMP + i * INTERVAL '1 second' AS date_metric,sonde::text,random() AS metric
FROM generate_series(0, 3600*24*365) i,
LATERAL (SELECT name FROM probe) sonde)
SELECT * FROM generation;

The resulting table is a bit over 150 GB, so it does not fit in the RAM of the machine hosting the instance, let alone in PostgreSQL's shared buffers. (A quick way to check its size is shown below.)
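
A minimal size check, not part of the original article, assuming the table is named data as created above:

-- Total on-disk size of the "data" table, indexes included
SELECT pg_size_pretty(pg_total_relation_size('data'));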

Maintenance

On crée plein d’index pour comparer leur taille :

CREATE INDEX metro_btree_idx ON data USING btree (date_metric);
CREATE INDEX metro_brin_idx_8 ON data USING brin (date_metric) WITH (pages_per_range = 8);
CREATE INDEX metro_brin_idx_16 ON data USING brin (date_metric) WITH (pages_per_range = 16);
CREATE INDEX metro_brin_idx_32 ON data USING brin (date_metric) WITH (pages_per_range = 32);
CREATE INDEX metro_brin_idx_64 ON data USING brin (date_metric) WITH (pages_per_range = 64);
CREATE INDEX metro_brin_idx_128 ON data USING brin (date_metric);
CREATE INDEX metro_brin_idx_256 ON data USING brin (date_metric) WITH (pages_per_range = 256);

Here are the results obtained for index creation time and index size:

[Figure: index creation times and sizes by index type]

La création de l’index a été 4 fois plus rapide pour les index BRIN. Il est possible que leur création aurait été plus rapide avec un stockage plus performant.

La taille des index est également frappante,  l’index b-tree fait 66 Go alors que l’index BRIN avec le pages_per_range par défaut fait seulement 5 Mo.
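
As an illustration (not from the original article), the sizes can be listed with a query along these lines, assuming the index names used above:

-- Size of every index defined on the "data" table
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'data'
ORDER BY pg_relation_size(indexrelid) DESC;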

We can see right away the gain in disk space and in index creation speed. Maintenance operations (REINDEX) will be made much easier.

Read performance

We will run several tests; the idea is to highlight the behavioural differences between BRIN and b-tree indexes.

The query used is very simple:

EXPLAIN (ANALYSE,BUFFERS,VERBOSE) SELECT date_metric,sonde,metric FROM DATA WHERE date_metric = 'xxx';

To get a result with only a few rows:

WHERE date_metric = '2015-05-01 00:00:00'::TIMESTAMP

To get more rows, we use a range instead:

WHERE date_metric BETWEEN 'xxx'::TIMESTAMP AND 'xxx'::TIMESTAMP;
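
Putting the pieces together, a concrete instance of the range form could look like this (the dates are purely illustrative and simply select a large slice of the generated data):

EXPLAIN (ANALYZE,BUFFERS,VERBOSE)
SELECT date_metric,sonde,metric
FROM data
WHERE date_metric BETWEEN '2015-03-01'::TIMESTAMP AND '2015-04-01'::TIMESTAMP;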

Here are the results obtained:

 Rows        | BRIN: time | BRIN: blocks read | B-tree: time              | B-tree: blocks read      | Gain: time                                   | Gain: data read
-------------+------------+-------------------+---------------------------+--------------------------+----------------------------------------------+------------------------------------------------
 100         | 24 ms      | 697               | 0.06 ms                   | 7                        | B-tree (x400)                                | B-tree (x100)
 267 million | 170 s      | 13 GB             | 228 s                     | 18 GB                    | BRIN (x1.3)                                  | BRIN (x1.4)
 777 million | 8 min      | 38 GB             | 11 min                    | 54 GB                    | BRIN (x1.37)                                 | BRIN (x1.4)
 1.3 billion | 13 min     | 63 GB             | 32 min (seqscan) / 18 min | 153 GB (seqscan) / 90 GB | BRIN (x2) vs seqscan / BRIN (x1.4) vs B-tree | BRIN (x2.4) vs seqscan / BRIN (x1.4) vs B-tree

To compare the amount of data read and the execution times, we can disable the indexes inside a transaction:

BEGIN;
DROP INDEX ...;
EXPLAIN (ANALYSE, VERBOSE, BUFFERS) SELECT ...;
ROLLBACK;

For the first test, the planner chooses the b-tree index. Once the b-tree index is dropped, it chooses the BRIN index.

For tests 2 and 3, the planner chooses the BRIN index; once the BRIN index is dropped, it chooses the b-tree index.

For the last test I added extra measurements. Indeed, once the BRIN index is dropped, the engine falls back to a seqscan (full scan of the table). To get comparisons consistent with the previous results, I therefore dropped the BRIN index and disabled sequential scans (set enable_seqscan to 'off';). The full sequence is sketched below.
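
A minimal sketch of that last scenario (illustrative; it assumes only the default BRIN index metro_brin_idx_128 and the b-tree index are present at that point):

BEGIN;
DROP INDEX metro_brin_idx_128;   -- steer the planner away from BRIN
SET enable_seqscan TO off;       -- and forbid the sequential scan
EXPLAIN (ANALYZE,BUFFERS,VERBOSE)
SELECT date_metric,sonde,metric
FROM data
WHERE date_metric BETWEEN '2015-01-01'::TIMESTAMP AND '2015-09-01'::TIMESTAMP;
ROLLBACK;                        -- the index comes back once the transaction rolls back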

Overall we can see a 30-40% gain in the cases where many rows are requested. The engine reads fewer blocks when it uses the BRIN index; the b-tree index being large, reading it is costly.

On the other hand, the b-tree index proves particularly efficient when the query is very selective and few rows are returned. With a BRIN index, the engine starts by reading the whole index, then reads every block range that may contain the requested value, some of which contain no matching row at all. These extra reads show up in the query's execution time.

Insert performance

Since BRIN indexes are smaller and faster to build, one may wonder what the overhead of such an index is when inserting data. To find out, we create a table and measure the insertion of 10 million rows depending on the indexes already present on the table. To isolate the overhead of index maintenance, the table is unlogged, which avoids writes to the WAL. Autovacuum is also disabled.

CREATE UNLOGGED TABLE brin_demo_2 (c1 INT);
INSERT INTO brin_demo_2 SELECT * FROM generate_series(1,10000000);
TRUNCATE brin_demo_2;

CREATE INDEX brin_demo_2_brin_idx ON brin_demo_2 USING brin (c1);
INSERT INTO brin_demo_2 SELECT * FROM generate_series(1,10000000);
DROP INDEX brin_demo_2_brin_idx;
TRUNCATE brin_demo_2;
 
CREATE INDEX brin_demo_2_brin_idx ON brin_demo_2 USING brin (c1) WITH (pages_per_range = 256);
INSERT INTO brin_demo_2 SELECT * FROM generate_series(1,10000000);
DROP INDEX brin_demo_2_brin_idx;
TRUNCATE brin_demo_2;
...

Here are the results obtained:

[Figure: insert times by index type]

As with the read-performance figures, these numbers only represent insert durations on my hardware. The table, the indexes and the WAL sit on the same physical disk, which slows down the operations.

Nevertheless, we can see that inserting data into a table with a BRIN index is cheaper than with a b-tree index. We can also see that there is no significant difference between the various BRIN settings.

Conclusion

Cette série d’articles a permis de présenter les principes des index BRIN. puis leur fonctionnement à travers des exemples simples.

Ensuite nous avons vu l’importance de la corrélation pour exploiter pleinement ces index. Enfin, nous avons essayé de mesurer le gain que pouvait apporter cet index sur de multiples aspect (maintenance, performance en lecture et insertion).

Décrire le fonctionnement d’un index en simplifiant sa représentation est un exercice compliqué. On peut vite sacrifier le fond à la forme. Présenter des chiffres est également délicat tellement ils peuvent dépendre du contexte. J’ai fait l’effort de détailler comment je les ai obtenu afin que chacun puisse reproduire ses propres tests. L’idée est de donner un aperçu des cas d’utilisation de ce type d’index.

Globalement il faut retenir que les index BRIN sont utiles pour les tables volumineuses et où la corrélation avec l’emplacement des données est importante. Ils seront plus lents que les index b-tree lorsque la recherche nécessite de parcourir peu de blocs. Ils seront un peu plus rapide que les index b-tree dans les situations où le moteur doit lire beaucoup de blocs (moins de blocs à lire dans l’index).

L’étude de cet index ouvre d’autres pistes de réflexion. Comme la prise en compte de la corrélation dans le calcul du coût. J’avais également pensé à la possibilité d’utiliser un index pour créer un autre index.

Dans l’exemple avec la table volumineuse (150Go). Si on souhaite créer un index partiel sur le mois précédent, le moteur va parcourir l’intégralité de la table pour créer d’index. On pourrait envisager créer l’index b-tree en utilisant l’index BRIN pour ne parcourir que les lignes correspondant au moins précédent.

by Adrien Nayrat on Thursday, April 21, 2016 at 20:58

Nicolas Gollet

A PostgreSQL 9.4 and 9.5 bug on Windows?

Some of us run into a bug when installing PostgreSQL 9.4 or 9.5 for Windows with the installer package provided by EnterpriseDB.

The installer fails with the following message [1]:

Problem running post-install step, Installation may not complete correctly The database cluster initialisation failed.

Looking at the installation logs, we can see:

initializing dependencies ... child process was terminated by exception 0xC000001D

The error code 0xC000001D means:
STATUS_ILLEGAL_INSTRUCTION = {EXCEPTION} Illegal Instruction An attempt was made to execute an illegal instruction.

In other words, an instruction not supported by the system was executed.

This problem occurs when a system does not correctly expose the extensions supported by the processor, the "CPUID" flags [2].

For example, the CPU exposes the AVX2 flag without exposing the XSAVE and OSXSAVE flags, making support for the "Advanced Vector Extensions" only partial. This partial support is not detected by the Visual Studio 2013 runtime, which crashes the application when those instructions are used. Here, during installation, it is the "postgres.exe" process that crashes...

This problem shows up when the hypervisor hosting a virtual machine is buggy, for example with Citrix XenServer 6.5. (Similar behaviour can also be seen on some workstations with a buggy BIOS.)

To check easily whether all the flags are correctly exposed on a system, you can run this small program available on my GitHub.

Regarding XenServer 6.5, Service Pack 1 (SP1) fixes the flag exposure on virtual machines by disabling AVX even when it is available on the host.

So this is not really a PostgreSQL bug, but rather a bug in the hypervisor (or machine) that misreports what the CPU supports, combined with the Visual Studio C++ 2013 runtime being unable to detect partial support...

This is a good reminder that virtualization is not completely transparent... and of how important it is to keep hypervisors up to date!

A similar bug was detected and fixed in the latest PostgreSQL minor releases, affecting Windows operating systems that do not support AVX running on processors that do: more information here.


  1. Problem running post-install step, Installation may not complete correctly The database cluster initialisation failed.

  2. https://en.wikipedia.org/wiki/CPUID

by Nicolas GOLLET on Thursday, April 21, 2016 at 13:03

Wednesday, April 20, 2016

Adrien Nayrat

BRIN Indexes – Correlation

PostgreSQL 9.5, released in January 2016, introduced a new index type: BRIN indexes, short for Block Range INdex. They are recommended for very large tables whose data are correlated with their physical location. I have decided to devote a series of articles to these indexes:

For your information, I will be at PGDay France in Lille on Tuesday, May 31 to present this index type. There will be plenty of other interesting talks as well!

In this third article we will see why the correlation between the data and their location matters for BRIN indexes.

Correlated data

The first examples were deliberately simple, even simplistic, to make them easy to understand. The records had one important characteristic: the values were increasing.

This means there is a strong correlation between the records and their location.

As a reminder, the engine stores statistics about tables in order to choose the best execution plan. Let's run ANALYZE on our brin_demo table and look at the pg_stats view, in particular the "correlation" column:

SELECT tablename,attname,correlation from pg_stats where tablename='brin_demo';
 tablename | attname | correlation 
-----------+---------+-------------
 brin_demo | c1 | 1

Let's try with random data:

CREATE TABLE brin_random (c1 INT);
INSERT INTO brin_random SELECT trunc(random() * 90 + 1) AS i FROM generate_series(1,100000);
CREATE INDEX brin_random_brin_idx_16 ON brin_random USING brin (c1) WITH (pages_per_range = 16);
ANALYZE brin_random;
SELECT tablename,attname,correlation from pg_stats where tablename='brin_random';
 tablename | attname | correlation 
-------------+---------+-------------
 brin_random | c1 | 0.0063248

=> The pg_stats view tells us that the data are not correlated with their location.

What does our index contain?

SELECT * FROM brin_page_items(get_raw_page('brin_random_brin_idx_16', 2), 'brin_random_brin_idx_16');
 itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | value 
------------+--------+--------+----------+----------+-------------+-----------
 1 | 0 | 1 | f | f | f | {1 .. 90}
 2 | 16 | 1 | f | f | f | {1 .. 90}
 3 | 32 | 1 | f | f | f | {1 .. 90}
 4 | 48 | 1 | f | f | f | {1 .. 90}
 5 | 64 | 1 | f | f | f | {1 .. 90}
 6 | 80 | 1 | f | f | f | {1 .. 90}
 7 | 96 | 1 | f | f | f | {1 .. 90}
 8 | 112 | 1 | f | f | f | {1 .. 90}
 9 | 128 | 1 | f | f | f | {1 .. 90}
 10 | 144 | 1 | f | f | f | {1 .. 90}
 11 | 160 | 1 | f | f | f | {1 .. 90}
 12 | 176 | 1 | f | f | f | {1 .. 90}
 13 | 192 | 1 | f | f | f | {1 .. 90}
 14 | 208 | 1 | f | f | f | {1 .. 90}
 15 | 224 | 1 | f | f | f | {1 .. 90}
 16 | 240 | 1 | f | f | f | {1 .. 90}
 17 | 256 | 1 | f | f | f | {1 .. 90}
 18 | 272 | 1 | f | f | f | {1 .. 90}
 19 | 288 | 1 | f | f | f | {1 .. 90}
 20 | 304 | 1 | f | f | f | {1 .. 90}
 21 | 320 | 1 | f | f | f | {1 .. 90}
 22 | 336 | 1 | f | f | f | {1 .. 90}
 23 | 352 | 1 | f | f | f | {1 .. 90}
 24 | 368 | 1 | f | f | f | {1 .. 90}
 25 | 384 | 1 | f | f | f | {1 .. 90}
 26 | 400 | 1 | f | f | f | {1 .. 90}
 27 | 416 | 1 | f | f | f | {1 .. 90}
 28 | 432 | 1 | f | f | f | {1 .. 90}
(28 rows)

L’index nous indique que tous les blocs contiennent des valeurs comprises entre 1 et 90.

Que donne une recherche des valeurs comprises entre 10 et 20?

EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT c1 FROM brin_random WHERE c1 > 10 AND c1 < 20;
                                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.brin_random (cost=118.46..717.27 rows=10387 width=4) (actual time=0.068..10.241 rows=10240 loops=1)
   Output: c1
   Recheck Cond: ((brin_random.c1 > 10) AND (brin_random.c1 < 20))
   Rows Removed by Index Recheck: 89760
   Heap Blocks: lossy=443
   Buffers: shared hit=445
   ->  Bitmap Index Scan on brin_random_brin_idx_16 (cost=0.00..115.87 rows=10387 width=0) (actual time=0.052..0.052 rows=4480 loops=1)
         Index Cond: ((brin_random.c1 > 10) AND (brin_random.c1 < 20))
         Buffers: shared hit=2

=> Le moteur lit l’intégralité de la table ainsi que 2 blocs d’index.

Il est légitime de se demander quel est l’intérêt d’utiliser l’index si on sait que les données ne sont pas corrélées. Et qu’au final le moteur sera contraint de lire toute la table.

Allons faire un tour dans le code source. Plus précisément à la ligne 7568 du fichier src/backend/utils/adt/selfuncs.c :

/*
 * BRIN indexes are always read in full; use that as startup cost.
 *
 * XXX maybe only include revmap pages here?
 */
 *indexStartupCost = spc_seq_page_cost * numPages * loop_count;

 /*
 * To read a BRIN index there might be a bit of back and forth over
 * regular pages, as revmap might point to them out of sequential order;
 * calculate this as reading the whole index in random order.
 */
 *indexTotalCost = spc_random_page_cost * numPages * loop_count;

 *indexSelectivity =
 clauselist_selectivity(root, indexQuals,
 path->indexinfo->rel->relid,
 JOIN_INNER, NULL);
 *indexCorrelation = 1;

 /*
 * Add on index qual eval costs, much as in genericcostestimate.
 */
 qual_arg_cost = other_operands_eval_cost(root, qinfos) +
 orderby_operands_eval_cost(root, path);
 qual_op_cost = cpu_operator_cost *
 (list_length(indexQuals) + list_length(indexOrderBys));

 *indexStartupCost += qual_arg_cost;
 *indexTotalCost += qual_arg_cost;
 *indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost);

We can see "*indexCorrelation = 1;". In reality the engine ignores the correlation... for now. A discussion is under way about taking correlation into account in the index cost estimate: http://www.postgresql.org/message-id/20151116135239.GV614468@alvherre.pgsql

Let's sort our table using the CLUSTER command and a b-tree index:

CREATE INDEX brin_random_btree_idx ON brin_random USING btree (c1);
CLUSTER brin_random USING brin_random_btree_idx;
DROP INDEX brin_random_btree_idx ;
ANALYZE brin_random ;

Let's check the correlation:

SELECT tablename,correlation FROM pg_stats WHERE tablename='brin_random';
 tablename | correlation 
-------------+-------------
 brin_random | 1
(1 row)

Let's run our query again:

EXPLAIN (ANALYZE,BUFFERS,VERBOSE) SELECT c1 FROM brin_random WHERE c1 > 10 AND c1 < 20;
                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.brin_random (cost=115.08..708.94 rows=10057 width=4) (actual time=0.113..3.166 rows=9999 loops=1)
   Output: c1
   Recheck Cond: ((brin_random.c1 > 10) AND (brin_random.c1 < 20))
   Rows Removed by Index Recheck: 849
   Heap Blocks: lossy=48
   Buffers: shared hit=50
   ->  Bitmap Index Scan on brin_random_brin_idx_16 (cost=0.00..112.57 rows=10057 width=0) (actual time=0.053..0.053 rows=480 loops=1)
         Index Cond: ((brin_random.c1 > 10) AND (brin_random.c1 < 20))
         Buffers: shared hit=2
 Planning time: 0.086 ms
 Execution time: 3.849 ms

This time the engine read fewer blocks; let's look at what our index now contains:

SELECT * FROM brin_page_items(get_raw_page('brin_random_brin_idx_16', 2), 'brin_random_brin_idx_16');
 itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | value 
------------+--------+--------+----------+----------+-------------+------------
 1 | 0 | 1 | f | f | f | {1 .. 4}
 2 | 16 | 1 | f | f | f | {4 .. 7}
 3 | 32 | 1 | f | f | f | {7 .. 10}
 4 | 48 | 1 | f | f | f | {10 .. 14}
 5 | 64 | 1 | f | f | f | {14 .. 17}
 6 | 80 | 1 | f | f | f | {17 .. 20}
 7 | 96 | 1 | f | f | f | {20 .. 23}
 8 | 112 | 1 | f | f | f | {23 .. 27}
 9 | 128 | 1 | f | f | f | {27 .. 30}
 10 | 144 | 1 | f | f | f | {30 .. 33}
 11 | 160 | 1 | f | f | f | {33 .. 36}
 12 | 176 | 1 | f | f | f | {36 .. 39}
 13 | 192 | 1 | f | f | f | {39 .. 43}
 14 | 208 | 1 | f | f | f | {43 .. 46}
 15 | 224 | 1 | f | f | f | {46 .. 49}
 16 | 240 | 1 | f | f | f | {49 .. 52}
 17 | 256 | 1 | f | f | f | {52 .. 56}
 18 | 272 | 1 | f | f | f | {56 .. 59}
 19 | 288 | 1 | f | f | f | {59 .. 62}
 20 | 304 | 1 | f | f | f | {62 .. 65}
 21 | 320 | 1 | f | f | f | {65 .. 69}
 22 | 336 | 1 | f | f | f | {69 .. 72}
 23 | 352 | 1 | f | f | f | {72 .. 75}
 24 | 368 | 1 | f | f | f | {75 .. 78}
 25 | 384 | 1 | f | f | f | {78 .. 82}
 26 | 400 | 1 | f | f | f | {82 .. 85}
 27 | 416 | 1 | f | f | f | {85 .. 88}
 28 | 432 | 1 | f | f | f | {88 .. 90}

This time the engine knew it only had to scan blocks 48 to 96. (The index entries can even be filtered to list the qualifying ranges, as sketched below.)
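
As an illustration (not part of the original article), the qualifying ranges can be pulled straight out of the BRIN index by parsing the text that brin_page_items() returns; the parsing assumes the "{min .. max}" formatting shown above:

-- Ranges whose [min, max] interval may satisfy c1 > 10 AND c1 < 20
SELECT itemoffset, blknum, value
FROM brin_page_items(get_raw_page('brin_random_brin_idx_16', 2),
                     'brin_random_brin_idx_16')
WHERE split_part(btrim(value, '{}'), ' .. ', 2)::int > 10   -- range max
  AND split_part(btrim(value, '{}'), ' .. ', 1)::int < 20;  -- range min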

by Adrien Nayrat on Wednesday, April 20, 2016 at 20:57

Tuesday, April 19, 2016

Damien Clochard

A Major Survey of the PostgreSQL Ecosystem

PostgreSQL's success is undeniable, yet plenty of software still is not compatible with Postgres. Every time I staff a PostgreSQL booth at a trade show or a conference, the same question comes up: "Is software X compatible with PostgreSQL?"

To better understand the compatibility needs, Takayuki Tsunakawa has just launched a major survey to catalogue the software that does not yet support PostgreSQL. A good opportunity to measure how far vendors still have to go...

For the survey to be meaningful, as many people as possible need to answer it, of course!

Have you ever thought, "If only this software were compatible with Postgres..."? Then now is the time to speak up by answering the survey below:

PostgreSQL Ecosystem Survey

It is very simple and takes three steps:

  1. Choose the software category
  2. Give the software's name
  3. Add a comment if necessary

You can report as many pieces of software as you like.

The survey results are available in real time here, and the discussion about this initiative is taking place on the pgsql-advocacy list.

by Damien Clochard on Tuesday, April 19, 2016 at 22:52

Monday, April 18, 2016

PostgreSQL.fr News

PostgreSQL Weekly News - April 17, 2016

PostgreSQL Session on September 22, 2016 in Lyon, France. The call for papers closes on May 20: send your proposals to call-for-paper AT postgresql-sessions DOT org.

[Translator's note: PostgreSQL meetups in Lyon on April 20 and in Nantes on April 26.]

Product news

PostgreSQL-related job offers for April

PostgreSQL Local

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team, under a CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160417214401.GB7788@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send English submissions to david (a) fetter.org, German ones to pwn (a) pgug.de, Italian ones to pwn (a) itpug.org and Spanish ones to pwn (a) arpug.com.ar.

Applied patches

Fujii Masao pushed:

  • Use ereport(ERROR) instead of Assert() to emit syncrep_parser error. The existing code would either Assert or generate an invalid SyncRepConfig variable, neither of which is desirable. A regular error should be thrown instead. This commit silences compiler warning in non assertion-enabled builds. Per report from Jeff Janes. Suggested fix by Tom Lane. http://git.postgresql.org/pg/commitdiff/0038c1e2181b520a9307aae6587e110468072392
  • Remove unused function GetOldestWALSendPointer from walsender code. That unused function was introduced as a sample because synchronous replication or replication monitoring tools might need it in the future. Recently commit 989be08 added the function SyncRepGetOldestSyncRecPtr which provides almost the same functionality for multiple synchronous standbys feature. So it's time to remove that unused sample function. This commit does that. http://git.postgresql.org/pg/commitdiff/46d73e0d65eef19e25bb0d31f1e5c23ff40a3444 http://git.postgresql.org/pg/commitdiff/cfe96ae24c97ff376157c48ccd5ca6d3938632be
  • Fix duplicated index entry in doc. Commit cfe96ae corrected the name of pg_logical_emit_message() in its index entry. But this typo fix caused duplicated index entry because there was another index entry for the function. Spotted by Tom Lane. http://git.postgresql.org/pg/commitdiff/c8cb7453233b31a177b08a3b2bdac4c31508dc00
  • Make regression test for multiple synchronous standbys more stable. The regression test checks whether the output of pg_stat_replication is expected or not after changing synchronous_standby_names and reloading the configuration file. Regarding this test logic, previously there was a timing issue which made the test result unstable. That is, pg_stat_replication could return unexpected result during small window after the configuration file was reloaded before new setting value took effect, and which made the test fail. This commit changes the test logic so that it uses a loop with a timeout to give some room for the test to pass. Now the test fails only when pg_stat_replication keeps returning unexpected result for 30 seconds. Michael Paquier http://git.postgresql.org/pg/commitdiff/36c1c91604cee164c6487afb99508f7ff8737b96

Peter Eisentraut pushed:

Tom Lane pushed:

  • Fix missing "volatile" in PLy_output(). Commit 5c3c3cd0a3046339 plastered "volatile" on a bunch of variables in PLy_output(), but removed the one that actually mattered, ie the one on "oldcontext". This allows some versions of clang to generate code in which "oldcontext" has been trashed when control reaches the PG_CATCH block. Per buildfarm member tick. http://git.postgresql.org/pg/commitdiff/81ba9348d85fdf87e84cc02112933b592845bda2
  • Fix freshly-introduced PL/Python portability bug. It turns out that those PyErr_Clear() calls I removed from plpy_elog.c in 7e3bb080387f4143 et al were not quite as random as they appeared: they mask a Python 2.3.x bug. (Specifically, it turns out that PyType_Ready() can fail if the error indicator is set on entry, and PLy_traceback's fetch of frame.f_code may be the first operation in a session that requires the "frame" type to be readied. Ick.) Put back the clear call, but in a more centralized place closer to what it's protecting, and this time with a comment warning what it's really for. Per buildfarm member prairiedog. Although prairiedog was only failing on HEAD, it seems clearly possible for this to occur in older branches as well, so back-patch to 9.2 the same as the previous patch. http://git.postgresql.org/pg/commitdiff/1d2f9de38d18152f83cf570581cebac0733ff504
  • Fix two places that thought Windows64 is indicated by WIN64 macro. Everyplace else thinks it's _WIN64, so make these places fall in line. The pg_regress.c usage is not going to result in any change in behavior, only suppressing (or not) a compiler warning about downcasting HANDLEs. So there seems no need for back-patching there. The libpq/win32.mak usage might represent an actual bug, if anyone were using this script to build for Windows64, which perhaps nobody is. Given the lack of field complaints, no back-patch here either. pg_regress.c problem found by Christian Ullrich, the other by me. http://git.postgresql.org/pg/commitdiff/b0e40d189325dc7a54d2546245e766f8c47a7c8d
  • Fix _SPI_execute_plan() for CREATE TABLE IF NOT EXISTS foo AS ... When IF NOT EXISTS was added to CREATE TABLE AS, this logic didn't get the memo, possibly resulting in an Assert failure. It looks like there would have been no ill effects in a non-Assert build, though. Back-patch to 9.5 where the IF NOT EXISTS option was added. Stas Kelvich http://git.postgresql.org/pg/commitdiff/39c283e498de1bb7c3d5beadfffcf3273ae8cc27
  • Remove unnecessary definition of _WIN64 in libpq/win32.mak. In commit b0e40d189325dc7a54d2546245e766f8c47a7c8d, I should have just removed the /D switch defining WIN64. The reason the code worked before is that all Windows64 compilers automatically predefine _WIN64. Perhaps at one time we had code that depended on WIN64 being defined, but it's long gone, and we should not encourage any reappearance. Per discussion with Christian Ullrich. http://git.postgresql.org/pg/commitdiff/e7bcde8ca0d376d9d23d61855baf67122a66c76a
  • In generic WAL application and replay, ensure page "hole" is always zero. The previous coding could allow the contents of the "hole" between pd_lower and pd_upper to diverge during replay from what it had been when the update was originally applied. This would pose a problem if checksums were in use, and in any case would complicate forensic comparisons between master and slave servers. So force the "hole" to contain zeroes, both at initial application of a generically-logged action, and at replay. Alexander Korotkov, adjusted slightly by me http://git.postgresql.org/pg/commitdiff/bdf7db81921deb99fd9d489cbcc635906c89e215
  • Improve API of GenericXLogRegister(). Rename this function to GenericXLogRegisterBuffer() to make it clearer what it does, and leave room for other sorts of "register" actions in future. Also, replace its "bool isNew" argument with an integer flags argument, so as to allow adding more flags in future without an API break. Alexander Korotkov, adjusted slightly by me http://git.postgresql.org/pg/commitdiff/5713f03973e26ad6df6df5ac8b9efa0123d68062
  • Improve coding of column-name parsing in psql's new crosstabview.c. Coverity complained about this code, not without reason because it was rather messy. Adjust it to not scribble on the passed string; that adds one malloc/free cycle per column name, which is going to be insignificant in context. We can actually const-ify both the string argument and the PGresult. Daniel Verité, with some further cleanup by me http://git.postgresql.org/pg/commitdiff/7a5f8b5c59033ac153963f98b9109be9529a824a
  • Redefine create_upper_paths_hook as being invoked once per upper relation. Per discussion, this gives potential users of the hook more flexibility, because they can build custom Paths that implement only one stage of upper processing atop core-provided Paths for earlier stages. http://git.postgresql.org/pg/commitdiff/f1f01de145d0aaca80e6cf8b2ccb7e7f4ed1ad02
  • Improve documentation for \crosstabview. Fix misleading syntax summary (there cannot be a space between colH and scolH). Provide a link from the existing crosstab() function's documentation to \crosstabview. Copy-edit the command's description. Christoph Berg and Tom Lane http://git.postgresql.org/pg/commitdiff/85e004707715f5ee7a6bfc3d03d0fbc837fb2432
  • Fix assorted portability issues with using msync() for data flushing. Commit 428b1d6b29ca599c5700d4bc4f4ce4c5880369bf introduced the use of msync() for flushing dirty data from the kernel's file buffers. Several portability issues were overlooked, though: * Not all implementations of mmap() think that nbytes == 0 means "map the whole file". To fix, use lseek() to find out the true length. Fix callers of pg_flush_data to be aware that nbytes == 0 may result in trashing the file's seek position. * Not all implementations of mmap() will accept partial-page mmap requests. To fix, round down the length request to whatever sysconf() says the page size is. (I think this is OK from a portability standpoint, because sysconf() is required by SUS v2, and we aren't trying to compile this part on Windows anyway. Buildfarm should let us know if not.) * On 32-bit machines, the file size might exceed the available free address space, or even exceed what will fit in size_t. Check for the latter explicitly to avoid passing a false request size to mmap(). If mmap fails, silently fall through to the next implementation method, rather than bleating to the postmaster log and giving up. * mmap'ing directories fails on some platforms, and even if it works, msync'ing the directory is quite unlikely to help, as for that matter are the other flush implementations. In pre_sync_fname(), just skip flush attempts on directories. In passing, copy-edit the comments a bit. Stas Kelvich and myself http://git.postgresql.org/pg/commitdiff/fa11a09fed2b6f483231608866a682ee3a376277
  • Widen amount-to-flush arguments of FileWriteback and callers. It's silly to define these counts as narrower than they might someday need to be. Also, I believe that the BLCKSZ * nflush calculation in mdwriteback was capable of overflowing an int. http://git.postgresql.org/pg/commitdiff/95ef43c4308102d23afa887c9fc28d9977612a2d
  • Fix pg_dump so pg_upgrade'ing an extension with simple opfamilies works. As reported by Michael Feld, pg_upgrade'ing an installation having extensions with operator families that contain just a single operator class failed to reproduce the extension membership of those operator families. This caused no immediate ill effects, but would create problems when later trying to do a plain dump and restore, because the seemingly-not-part-of- the-extension operator families would appear separately in the pg_dump output, and then would conflict with the families created by loading the extension. This has been broken ever since extensions were introduced, and many of the standard contrib extensions are affected, so it's a bit astonishing nobody complained before. The cause of the problem is a perhaps-ill-considered decision to omit such operator families from pg_dump's output on the grounds that the CREATE OPERATOR CLASS commands could recreate them, and having explicit CREATE OPERATOR FAMILY commands would impede loading the dump script into pre-8.3 servers. Whatever the merits of that decision when 8.3 was being written, it looks like a poor tradeoff now. We can fix the pg_upgrade problem simply by removing that code, so that the operator families are dumped explicitly (and then will be properly made to be part of their extensions). Although this fixes the behavior of future pg_upgrade runs, it does nothing to clean up existing installations that may have improperly-linked operator families. Given the small number of complaints to date, maybe we don't need to worry about providing an automated solution for that; anyone who needs to clean it up can do so with manual "ALTER EXTENSION ADD OPERATOR FAMILY" commands, or even just ignore the duplicate-opfamily errors they get during a pg_restore. In any case we need this fix. Back-patch to all supported branches. Discussion: <20228.1460575691@sss.pgh.pa.us> http://git.postgresql.org/pg/commitdiff/6cead413bb92be0579a2dbf6320121edcc32e369
  • Fix broken dependency-mongering for index operator classes/families. For a long time, opclasscmds.c explained that "we do not create a dependency link to the AM [for an opclass or opfamily], because we don't currently support DROP ACCESS METHOD". Commit 473b93287040b200 invented DROP ACCESS METHOD, but it batted only 1 for 2 on adding the dependency links, and 0 for 2 on updating the comments about the topic. In passing, undo the same commit's entirely inappropriate decision to blow away an existing index as a side-effect of create_am.sql. http://git.postgresql.org/pg/commitdiff/92a30a7eb0cadb008e18053f199af7de3fc1abaa
  • Fix prototype of pgwin32_bind(). I (tgl) had copied-and-pasted this from pgwin32_accept(), failing to notice that the third parameter should be "int" not "int *". David Rowley http://git.postgresql.org/pg/commitdiff/22989a8e34168f576e0f90b16fc3edabd28c40e6
  • Provide errno-translation wrappers around bind() and listen() on Windows. I've seen one too many "could not bind IPv4 socket: No error" log entries from the Windows buildfarm members. Per previous discussion, this is likely caused by the fact that we're doing nothing to translate WSAGetLastError() to errno. Put in a wrapper layer to do that. If this works as expected, it should get back-patched, but let's see what happens in the buildfarm first. Discussion: <4065.1452450340@sss.pgh.pa.us> http://git.postgresql.org/pg/commitdiff/d1b7d4877b9a71f476e8e5adea3b6afe419896ba
  • Docs: clarify description of LIMIT/OFFSET behavior. Section 7.6 was a tad confusing because it specified what LIMIT NULL does, but neglected to do the same for OFFSET NULL, making this look like perhaps a special case or a wrong restatement of the bit about LIMIT ALL. Wordsmith a bit while at it. Per bug #14084. See the equivalences sketched just after this list. http://git.postgresql.org/pg/commitdiff/fda21aa05bdc96c2c4141f5fd1245a11a41cf62c
  • Adjust datatype of ReplicationState.acquired_by. It was declared as "pid_t", which would be fine except that none of the places that printed it in error messages took any thought for the possibility that it's not equivalent to "int". This leads to warnings on some buildfarm members, and could possibly lead to actually wrong error messages on those platforms. There doesn't seem to be any very good reason not to just make it "int"; it's only ever assigned from MyProcPid, which is int. If we want to cope with PIDs that are wider than int, this is not the place to start. Also, fix the comment, which seems to perhaps be a leftover from a time when the field was only a bool? Per buildfarm. Back-patch to 9.5 which has same issue. http://git.postgresql.org/pg/commitdiff/994f11257328e272a6a43d3de59ffa916cbfbe96
  • Adjust signature of walrcv_receive hook. Commit 314cbfc5da988eff redefined the signature of this hook as typedef int (*walrcv_receive_type) (char **buffer, int *wait_fd); But in fact the type of the "wait_fd" variable ought to be pgsocket, which is what WaitLatchOrSocket expects, and which is necessary if we want to be able to assign PGINVALID_SOCKET to it on Windows. So fix that. http://git.postgresql.org/pg/commitdiff/c2dc194bdbf5f84ceb433ed416eb389c1234ebc9
  • Fix core dump in ReorderBufferRestoreChange on alignment-picky platforms. When re-reading an update involving both an old tuple and a new tuple from disk, reorderbuffer.c was careless about whether the new tuple is suitably aligned for direct access --- in general, it isn't. We'd missed seeing this in the buildfarm because the contrib/test_decoding tests exercise this code path only a few times, and by chance all of those cases have old tuples with length a multiple of 4, which is usually enough to make the access to the new tuple's t_len safe. For some still-not-entirely-clear reason, however, Debian's sparc build gets a bus error, as reported by Christoph Berg; perhaps it's assuming 8-byte alignment of the pointer? The lack of previous field reports is probably because you need all of these conditions to trigger a crash: an alignment-picky platform (not Intel), a transaction large enough to spill to disk, an update within that xact that changes a primary-key field and has an odd-length old tuple, and of course logical decoding tracing the transaction. Avoid the alignment assumption by using memcpy instead of fetching t_len directly, and add a test case that exposes the crash on picky platforms. Back-patch to 9.4 where the bug was introduced. Discussion: <20160413094117.GC21485@msg.credativ.de> http://git.postgresql.org/pg/commitdiff/6a3d3965d6d5eec30e1c36b3ffa3355ee9201933
  • Rethink \crosstabview's argument parsing logic. \crosstabview interpreted its arguments in an unusual way, including doing case-insensitive matching of unquoted column names, which is surely not the right thing. Rip that out in favor of doing something equivalent to the dequoting/case-folding rules used by other psql commands. To keep it simple, change the syntax so that the optional sort column is specified as a separate argument, instead of the also-quite-unusual syntax that attached it to the colH argument with a colon. Also, rework the error messages to be closer to project style. See the usage sketch just after this list. http://git.postgresql.org/pg/commitdiff/6f0d6a507889d94a79c0d18577a0cb1ccc2b6815
  • Fix memory leak in GIN index scans. The code had a query-lifespan memory leak when encountering GIN entries that have posting lists (rather than posting trees, ie, there are a relatively small number of heap tuples containing this index key value). With a suitable data distribution this could add up to a lot of leakage. Problem seems to have been introduced by commit 36a35c550, so back-patch to 9.4. Julien Rouhaud http://git.postgresql.org/pg/commitdiff/f0e766bd7f77774075297526bd2da8f3de226c1f
  • Fix portability problem induced by commit a6f6b7819. pg_xlogdump includes bufmgr.h. With a compiler that emits code for static inline functions even when they're unreferenced, that leads to unresolved external references in the new static-inline version of BufferGetPage(). So hide it with #ifndef FRONTEND, as we've done for similar issues elsewhere. Per buildfarm member pademelon. http://git.postgresql.org/pg/commitdiff/6b85d4ba9b09dc94cf1b14aef517da095a83cdbb
  • Fix possible crash in ALTER TABLE ... REPLICA IDENTITY USING INDEX. Careless coding added by commit 07cacba983ef79be could result in a crash or a bizarre error message if someone tried to select an index on the OID column as the replica identity index for a table. Back-patch to 9.4 where the feature was introduced. Discussion: CAKJS1f8TQYgTRDyF1_u9PVCKWRWz+DkieH=U7954HeHVPJKaKg@mail.gmail.com David Rowley http://git.postgresql.org/pg/commitdiff/8f1911d5e6d5a1e62c860ddb040d664b01c6415c
  • Use less-generic names in matview.sql. The original coding of this test used table and view names like "t", "tv", "foo", etc. This tended to interfere with doing simple manual tests in the regression database; not to mention that it posed a considerable risk of conflict with other regression test scripts. Prefix these names with "mvtest_" to avoid such conflicts. Also, change transiently-created role name to be "regress_xxx" per discussions about being careful with regression-test role creation. http://git.postgresql.org/pg/commitdiff/4447f0bcb66547708fa977d6b252046e792a7e04
  • Disallow creation of indexes on system columns (except for OID). Although OID acts pretty much like user data, the other system columns do not, so an index on one would likely misbehave. And it's pretty hard to see a use-case for one, anyway. Let's just forbid the case rather than worry about whether it should be supported. David Rowley http://git.postgresql.org/pg/commitdiff/c34df8a003c3e478d70e8251bd2a24d710b297d4
  • Adjust spin.c's spinlock emulation so that 0 is not a valid spinlock value. We've had repeated troubles over the years with failures to initialize spinlocks correctly; see 6b93fcd14 for a recent example. Most of the time, on most platforms, such oversights can escape notice because all-zeroes is the expected initial content of an slock_t variable. The only platform we have where the initialized state of an slock_t isn't zeroes is HPPA, and that's practically gone in the wild. To make it easier to catch such errors without needing one of those, adjust the --disable-spinlocks code so that zero is not a valid value for an slock_t for it. In passing, remove a bunch of unnecessary #include's from spin.c; commit daa7527afc227443 removed all the intermodule coupling that made them necessary. http://git.postgresql.org/pg/commitdiff/4039c736eb0955cb1daf88e211f105dbbb78f7ea
  • Avoid code duplication in \crosstabview. In commit 6f0d6a507 I added a duplicate copy of psqlscanslash's identifier downcasing code, but actually it's not hard to split that out as a callable subroutine and avoid the duplication. http://git.postgresql.org/pg/commitdiff/9603a32594d2f5e6d9a1f098bc554a68f44ccb3c
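
As a quick, hedged illustration of the reworked \crosstabview syntax mentioned above (the "sales" table, its columns and its data are hypothetical; the optional sort column is now a separate fourth argument rather than being attached to colH with a colon):

    SELECT region, quarter, sum(amount) AS total
    FROM sales
    GROUP BY region, quarter
    \crosstabview region quarter total

And, per the LIMIT/OFFSET documentation clarification above, these statements all return the same rows, since LIMIT NULL behaves like LIMIT ALL and OFFSET NULL behaves like OFFSET 0:

    SELECT * FROM sales;                -- no LIMIT or OFFSET at all
    SELECT * FROM sales LIMIT ALL;
    SELECT * FROM sales LIMIT NULL;
    SELECT * FROM sales OFFSET NULL;    -- equivalent to OFFSET 0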

Stephen Frost pushed:

  • Correct copyright for newly added genericdesc.c It's 2016 these days (no, not entirely sure how we got here either). Pointed out by Amit Langote http://git.postgresql.org/pg/commitdiff/cd13471f2e9dee6d411cae3ddae72d0ad6b58c4d
  • Disallow SET SESSION AUTHORIZATION pg_* As part of reserving the pg_* namespace for default roles and in line with SET ROLE and other previous efforts, disallow setting the role to a default/reserved role using SET SESSION AUTHORIZATION. These checks and restrictions on what is allowed regarding default / reserved roles are under debate, but it seems prudent to ensure that the existing checks at least cover the intended cases while the debate rages on. On me to clean it up if the consensus decision is to remove these checks. A quick illustration follows this list. http://git.postgresql.org/pg/commitdiff/bfed4ab824789fd7c000286650d4498dccb05634
  • In recordExtensionInitPriv(), keep the scan til we're done with it For reasons of sheer brain fade, we (I) was calling systable_endscan() immediately after systable_getnext() and expecting the tuple returned by systable_getnext() to still be valid. That's clearly wrong. Move the systable_endscan() down below the tuple usage. Discovered initially by Pavel Stehule and then also by Alvaro. Add a regression test based on Alvaro's testing. http://git.postgresql.org/pg/commitdiff/99f2f3c19ae7d6aa2950a9bdb549217c5a60d941
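
A minimal sketch of the new SET SESSION AUTHORIZATION restriction (assuming the pg_signal_backend default role; the exact error wording is not reproduced here):

    -- Previously accepted when run by a superuser; now rejected with an error:
    SET SESSION AUTHORIZATION pg_signal_backend;
    -- SET ROLE pg_signal_backend was already refused by the earlier checks.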

Kevin Grittner pushed:

  • Make oldSnapshotControl a pointer to a volatile structure It was incorrectly declared as a volatile pointer to a non-volatile structure. Eliminate the OldSnapshotControl struct definition; it is really not needed. Pointed out by Tom Lane. While at it, add OldSnapshotControlData to pgindent's list of structures. http://git.postgresql.org/pg/commitdiff/80647bf65a03e232c995c0826ef394dad8d685fe
  • Use static inline function for BufferGetPage() I was initially concerned that some of the hundreds of references to BufferGetPage() where the literal BGP_NO_SNAPSHOT_TEST was passed might not optimize as well as a macro, leading to some hard-to-find performance regressions in corner cases. Inspection of disassembled code has shown identical code at all inspected locations, and the size difference doesn't amount to even one byte per such call. So make it readable. Per gripes from Álvaro Herrera and Tom Lane http://git.postgresql.org/pg/commitdiff/a6f6b78196a701702ec4ff6df56c346bdcf9abd2
  • Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0 On a big NUMA machine with 1000 connections in saturation load there was a performance regression due to spinlock contention, for acquiring values which were never used. Just fill with dummy values if we're not going to use them. This patch has not been benchmarked yet on a big NUMA machine, but it seems like a good idea on general principle, and it seemed to prevent an apparent 2.2% regression on a single-socket i7 box running 200 connections at saturation load. A configuration sketch follows this list. http://git.postgresql.org/pg/commitdiff/2201d801b03c2d1b0bce4d6580b718dc34d38b3e
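
A hedged configuration sketch for old_snapshot_threshold, mentioned in the last item above: it is a server-start setting whose default of -1 disables the "snapshot too old" machinery entirely, which is exactly the case the commit avoids taking extra locks for.

    ALTER SYSTEM SET old_snapshot_threshold = -1;    -- default: feature disabled
    ALTER SYSTEM SET old_snapshot_threshold = '1h';  -- enable, with a one-hour threshold
    -- Either change takes effect only after a server restart.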

Teodor Sigaev pushed:

Robert Haas pushed:

  • Fix costing for parallel aggregation. The original patch kind of ignored the fact that we were doing something different from a costing point of view, but nobody noticed. This patch fixes that oversight. David Rowley http://git.postgresql.org/pg/commitdiff/deb71fa9713dfe374a74fc58a5d298b5f25da3f5
  • Use PG_INT32_MIN instead of reiterating the constant. Makes no difference, but it's cleaner this way. Michael Paquier http://git.postgresql.org/pg/commitdiff/cbb2a812d710dd58e68088b334f8c492346a0d0f
  • Tweak EXPLAIN for parallel query to show workers launched. The previous display was sort of confusing, because it didn't distinguish between the number of workers that we planned to launch and the number that actually got launched. This has already confused several people, so display both numbers and label them clearly. Julien Rouhaud, reviewed by me. See the example just after this list. http://git.postgresql.org/pg/commitdiff/5702277ca97396384eaf5c58d582b79b9984ce73
  • postgres_fdw: Clean up handling of system columns. Previously, querying the xmin column of a single postgres_fdw foreign table fetched the tuple length, xmax the typmod, and cmin or cmax the composite type OID of the tuple. However, when you queried several such tables and the join got shipped to the remote side, these columns ended up containing the remote values of the corresponding columns. Both behaviors are rather unprincipled, the former for obvious reasons and the latter because the remote values of these columns don't have any local significance; our transaction IDs are in a different space than those of the remote machine. Clean this up by setting all of these fields to 0 in both cases. Also fix the handling of tableoid to be sane. Robert Haas and Ashutosh Bapat, reviewed by Etsuro Fujita. http://git.postgresql.org/pg/commitdiff/da7d44b627ba839de32c9409aca659f60324de76
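
To see the new EXPLAIN labels, something along these lines (illustrative only; the plan shape and the worker counts depend entirely on the table, the settings and the load at run time):

    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT count(*) FROM big_table;
    -- The Gather node in the output now reports both
    --   Workers Planned:  N   (what the planner asked for)
    --   Workers Launched: M   (what was actually started at run time)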

Andres Freund pushed:

  • Avoid atomic operation in MarkLocalBufferDirty(). The recent patch to make Pin/UnpinBuffer lockfree in the hot path (48354581a) accidentally used pg_atomic_fetch_or_u32() in MarkLocalBufferDirty(). Other code operating on local buffers was careful to only use pg_atomic_read/write_u32, which just read/write from memory, to avoid unnecessary overhead. On its own that'd just make MarkLocalBufferDirty() slightly less efficient, but in addition InitLocalBuffers() doesn't call pg_atomic_init_u32() - thus the spinlock fallback for the atomic operations isn't initialized. That in turn caused, as reported by Tom, buildfarm animal gaur to fail. As those errors are actually useful against this type of error, continue to omit - intentionally this time - initialization of the atomic variable. In addition, add an explicit note about only using pg_atomic_read/write on local buffers' state to BufferDesc's description. Reported-By: Tom Lane Discussion: 1881.1460431476@sss.pgh.pa.us http://git.postgresql.org/pg/commitdiff/6b93fcd149329d4ee7319561b30fc15a573c6307
  • Make init_spin_delay() C89 compliant and change stuck spinlock reporting. The current definition of init_spin_delay (introduced recently in 48354581a) wasn't C89 compliant. It's not legal to refer to non-constant expressions, and the ptr argument was one. This, as reported by Tom, led to a failure on buildfarm animal pademelon. The pointer, especially on systems with ASLR, isn't super helpful anyway, though. So instead of making init_spin_delay into an inline function, make s_lock_stuck() report the function name in addition to file:line and change init_spin_delay() accordingly. While not a direct replacement, the function name is likely more useful anyway (line numbers are often hard to interpret in third party reports). This also fixes what file/line number is reported for waits via s_lock(). As PG_FUNCNAME_MACRO is now used outside of elog.h, move it to c.h. Reported-By: Tom Lane Discussion: 4369.1460435533@sss.pgh.pa.us http://git.postgresql.org/pg/commitdiff/80abbeba23d466b6541cf95082a9e1f36704424e
  • Add required database and origin filtering for logical messages. Logical messages, added in 3fe3511d05, during decoding failed to filter messages emitted in other databases and messages emitted "under" a replication origin the output plugin isn't interested in. Add tests to verify that both types of filtering actually work. While touching message.sql remove hunk obsoleted by d25379e. Bump XLOG_PAGE_MAGIC because xl_logical_message changed and because 3fe3511d05 had omitted doing so. 3fe3511d05 additionally didn't bump catversion, but 7a542700d has done so since. Author: Petr Jelinek Reported-By: Andres Freund Discussion: 20160406142513.wotqy3ba3kanr423@alap3.anarazel.de http://git.postgresql.org/pg/commitdiff/be65eddd80093a923b091dc60776aa6f966d1f07
  • Remove trailing commas in enums. These aren't valid C89. Found thanks to gcc's -Wc90-c99-compat. These exist in differing places in most supported branches. http://git.postgresql.org/pg/commitdiff/533cd2303aa6558721e76295fd1ffb05211764f9
  • Make init_spin_delay() C89 compliant #2. My previous attempt at doing so, in 80abbeba23, was not sufficient. While that fixed the problem for bufmgr.c and lwlock.c, s_lock.c still has non-constant expressions in the struct initializer, because the file/line/function information comes from the caller of s_lock(). Give up on using a macro, and use a static inline instead. Discussion: 4369.1460435533@sss.pgh.pa.us http://git.postgresql.org/pg/commitdiff/4b74c6a40e7ac9dad7cdeb4cfd2d51ea60cfdbb5
  • Fix trivial typo. http://git.postgresql.org/pg/commitdiff/7b16781228d6c0a2db66d71e33e64b9606779feb

Magnus Hagander pushed:

Rejected patches (so far)

No one was disappointed this week :-)

Pending patches

Michaël Paquier sent in two more revisions of a patch to add VS 2015 support to MSVC.

Robert Haas sent in a patch to fix some alignment issues in src/backend/storage/buffer/buf_init.c.

Stas Kelvich and Michaël Paquier traded patches to speed up two-phase commits.

Etsuro Fujita sent in a patch to improve the way foreign tables get written to in the PostgreSQL FDW.

Anastasia Lubennikova sent in another revision of a patch to implement covering unique indexes.

Amit Kapila sent in a patch to pad the PGXACT struct out to 64 bytes.

Teodor Sigaev sent in a patch to fix some GIN index corruption bugs.

David Rowley sent in two revisions of a patch to fix some issues with EXPLAIN output for parallel aggregates.

Ashutosh Sharma sent in a patch to fix an issue with pg_basebackup's handling of symlinks.

Peter Geoghegan sent in a patch to remove an obsolete comment from fmgr.c.

Amit Langote sent in another revision of a patch to implement declarative partitioning.

Michaël Paquier sent in three more revisions of a patch to fix interrupt handling in the PostgreSQL FDW.

David Rowley sent in three revisions of a patch to disallow creating unique indexes on system columns where that doesn't make sense.

Etsuro Fujita sent in a patch to fix an issue where a combination of FULL and INNER joins could produce a wrong result with the PostgreSQL FDW.

Feike Steenbergen sent in a patch to make pg_get_functiondef actually add parallel indicators.

Piotr Stefaniak sent in another revision of a patch to avoid passing null pointers to functions that require the pointers to be non null.

Magnus Hagander sent in a patch to fix some issues with the backup docs.

Craig Ringer sent in a patch to enable logical timeline following in the walsender.

Terence Ferraro sent in a patch to make it possible for an environment variable to be provided to libpq to specify where to find the SSL certificate/key files used for a secure connection.

by N Bougain on Monday 18 April 2016 at 22:06

Friday 15 April 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 10 April 2016

News about derived products

PostgreSQL-related job offers in April

PostgreSQL Local

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team, under the CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160411051000.GA17121@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org, and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Tom Lane pushed:

  • Clean up dubious code in contrib/seg. The restore() function assumed that the result of sprintf() with %e format would necessarily contain an 'e', which is false: what if the supplied number is an infinity or NaN? If that did happen, we'd get a null-pointer-dereference core dump. The case appears impossible currently, because seg_in() does not accept such values, and there are no seg-creating functions that would create one. But it seems unwise to rely on it never happening in future. Quite aside from that, the code was pretty ugly: it relied on modifying a static format string when it could use a "*" precision argument, and it used strtok() entirely gratuitously, and it stripped off trailing spaces by hand instead of just not asking for them to begin with. Coverity noticed the potential null pointer dereference (though I wonder why it didn't complain years ago, since this code is ancient). Since this is just code cleanup and forestalling a hypothetical future bug, there seems no need for back-patching. http://git.postgresql.org/pg/commitdiff/a75a418d07bf852dc9fdb85ccfb39c763aa057a9
  • Fix latent portability issue in pgwin32_dispatch_queued_signals(). The first iteration of the signal-checking loop would compute sigmask(0) which expands to 1<<(-1) which is undefined behavior according to the C standard. The lack of field reports of trouble suggest that it evaluates to 0 on all existing Windows compilers, but that's hardly something to rely on. Since signal 0 isn't a queueable signal anyway, we can just make the loop iterate from 1 instead, and save a few cycles as well as avoiding the undefined behavior. In passing, avoid evaluating the volatile expression UNBLOCKED_SIGNAL_QUEUE twice in a row; there's no reason to waste cycles like that. Noted by Aleksander Alekseev, though this isn't his proposed fix. Back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/58666ed28ab59a2686ee08bc648b4e9959aacfce
  • Introduce a LOG_SERVER_ONLY ereport level, which is never sent to client. This elevel is useful for logging audit messages and similar information that should not be passed to the client. It's equivalent to LOG in terms of decisions about logging priority in the postmaster log, but messages with this elevel will never be sent to the client. In the current implementation, it's just an alias for the longstanding COMMERROR elevel (or more accurately, we've made COMMERROR an alias for this). At some point it might be interesting to allow a LOG_ONLY flag to be attached to any elevel, but that would be considerably more complicated, and it's not clear there's enough use-cases to justify the extra work. For now, let's just take the easy 90% solution. David Steele, reviewed by Fabien Coelho, Petr Jelínek, and myself http://git.postgresql.org/pg/commitdiff/66229ac0040cf1e0f5b9d72271aa9feaf3b3a37e
  • Add a \gexec command to psql for evaluation of computed queries. \gexec executes the just-entered query, like \g, but instead of printing the results it takes each field as a SQL command to send to the server. Computing a series of queries to be executed is a fairly common thing, but up to now you always had to resort to kluges like writing the queries to a file and then inputting the file. Now it can be done with no intermediate step. The implementation is fairly straightforward except for its interaction with FETCH_COUNT. ExecQueryUsingCursor isn't capable of being called recursively, and even if it were, its need to create a transaction block interferes unpleasantly with the desired behavior of \gexec after a failure of a generated query (i.e., that it can continue). Therefore, disable use of ExecQueryUsingCursor when doing the master \gexec query. We can still apply it to individual generated queries, however, and there might be some value in doing so. While testing this feature's interaction with single-step mode, I (tgl) was led to conclude that SendQuery needs to recognize SIGINT (cancel_pressed) as a negative response to the single-step prompt. Perhaps that's a back-patchable bug fix, but for now I just included it here. Corey Huinker, reviewed by Jim Nasby, Daniel Vérité, and myself http://git.postgresql.org/pg/commitdiff/2bbe9112aec60abc2d3b4c39e75d0cbdcaaa45e1
  • Add a few comments about ANALYZE's strategy for collecting MCVs. Alex Shulgin complained that the underlying strategy wasn't all that apparent, particularly not the fact that we intentionally have two code paths depending on whether we think the column has a limited set of possible values or not. Try to make it clearer. http://git.postgresql.org/pg/commitdiff/3c69b33f459f62fe6db66c386ef12620ea697f74
  • Partially revert commit 3d3bf62f30200500637b24fdb7b992a99f9704c3. On reflection, the pre-existing logic in ANALYZE is specifically meant to compare the frequency of a candidate MCV against the estimated frequency of a random distinct value across the whole table. The change to compare it against the average frequency of values actually seen in the sample doesn't seem very principled, and if anything it would make us less likely not more likely to consider a value an MCV. So revert that, but keep the aspect of considering only nonnull values, which definitely is correct. In passing, rename the local variables in these stanzas to "ndistinct_table", to avoid confusion with the "ndistinct" that appears at an outer scope in compute_scalar_stats. http://git.postgresql.org/pg/commitdiff/391159e03a8b69dd04a1432ceb800c7c4c3d608c
  • Disallow newlines in parameter values to be set in ALTER SYSTEM. As noted by Julian Schauder in bug #14063, the configuration-file parser doesn't support embedded newlines in string literals. While there might someday be a good reason to remove that restriction, there doesn't seem to be one right now. However, ALTER SYSTEM SET could accept strings containing newlines, since many of the variable-specific value-checking routines would just see a newline as whitespace. This led to writing a postgresql.auto.conf file that was broken and had to be removed manually. Pending a reason to work harder, just throw an error if someone tries this. In passing, fix several places in the ALTER SYSTEM logic that failed to provide an errcode() for an ereport(), and thus would falsely log the failure as an internal XX000 error. Back-patch to 9.4 where ALTER SYSTEM was introduced. http://git.postgresql.org/pg/commitdiff/99f3b5613bd1f145b5dbbe86000337bbe37fb094
  • Fix PL/Python for recursion and interleaved set-returning functions. PL/Python failed if a PL/Python function was invoked recursively via SPI, since arguments are passed to the function in its global dictionary (a horrible decision that's far too ancient to undo) and it would delete those dictionary entries on function exit, leaving the outer recursion level(s) without any arguments. Not deleting them would be little better, since the outer levels would then see the innermost level's arguments. Since PL/Python uses ValuePerCall mode for evaluating set-returning functions, it's possible for multiple executions of the same SRF to be interleaved within a query. PL/Python failed in such a case, because it stored only one iterator per function, directly in the function's PLyProcedure struct. Moreover, one interleaved instance of the SRF would see argument values that should belong to another. Hence, invent code for saving and restoring the argument entries. To fix the recursion case, we only need to save at recursive entry and restore at recursive exit, so the overhead in non-recursive cases is negligible. To fix the SRF case, we have to save when suspending a SRF and restore when resuming it, which is potentially not negligible; but fortunately this is mostly a matter of manipulating Python object refcounts and should not involve much physical data copying. Also, store the Python iterator and saved argument values in a structure associated with the SRF call site rather than the function itself. This requires adding a memory context deletion callback to ensure that the SRF state is cleaned up if the calling query exits before running the SRF to completion. Without that we'd leak a refcount to the iterator object in such a case, resulting in session-lifespan memory leakage. (In the pre-existing code, there was no memory leak because there was only one iterator pointer, but what would happen is that the previous iterator would be resumed by the next query attempting to use the SRF. Hardly the semantics we want.) We can buy back some of whatever overhead we've added by getting rid of PLy_function_delete_args(), which seems a useless activity: there is no need to delete argument entries from the global dictionary on exit, since the next time anyone would see the global dict is on the next fresh call of the PL/Python function, at which time we'd overwrite those entries with new arg values anyway. Also clean up some really ugly coding in the SRF implementation, including such gems as returning directly out of a PG_TRY block. (The only reason that failed to crash hard was that all existing call sites immediately exited their own PG_TRY blocks, popping the dangling longjmp pointer before there was any chance of it being used.) In principle this is a bug fix; but it seems a bit too invasive relative to its value for a back-patch, and besides the fix depends on memory context callbacks so it could not go back further than 9.5 anyway. Alexey Grishchenko and Tom Lane http://git.postgresql.org/pg/commitdiff/1d2fe56e42640613781fc17ab1534fd0551de9bd
  • Run pgindent on a batch of (mostly-planner-related) source files. Getting annoyed at the amount of unrelated chatter I get from pgindent'ing Rowley's unique-joins patch. Re-indent all the files it touches. http://git.postgresql.org/pg/commitdiff/de94e2af184e25576b13cbda8cf825118835d1cd
  • Refactor join_is_removable() to separate out distinctness-proving logic. Extracted from pending unique-join patch, since this is a rather large delta but it's simply moving code out into separately-accessible subroutines. I (tgl) did choose to add a bit more logic to rel_supports_distinctness, so that it verifies that there's at least one potentially usable unique index rather than just checking indexlist != NIL. Otherwise there's no functional change here. David Rowley http://git.postgresql.org/pg/commitdiff/f338dd7585cab45da9053e883ad65a440a99d3be
  • Fix multiple bugs in tablespace symlink removal. Don't try to examine S_ISLNK(st.st_mode) after a failed lstat(). It's undefined. Also, if the lstat() reported ENOENT, we do not wish that to be a hard error, but the code might nonetheless treat it as one (giving an entirely misleading error message, too) depending on luck-of-the-draw as to what S_ISLNK() returned. Don't throw error for ENOENT from rmdir(), either. (We're not really expecting ENOENT because we just stat'd the file successfully; but if we're going to allow ENOENT in the symlink code path, surely the directory code path should too.) Generate an appropriate errcode for its-the-wrong-type-of-file complaints. (ERRCODE_SYSTEM_ERROR doesn't seem appropriate, and failing to write errcode() around it certainly doesn't work, and not writing an errcode at all is not per project policy.) Valgrind noticed the undefined S_ISLNK result; the other problems emerged while reading the code in the area. All of this appears to have been introduced in 8f15f74a44f68f9c. Back-patch to 9.5 where that commit appeared. http://git.postgresql.org/pg/commitdiff/93c301fc4ff7d4f06bff98fea8db47ce67f28155
  • Add BSD authentication method. Create a "bsd" auth method that works the same as "password" so far as clients are concerned, but calls the BSD Authentication service to check the password. This is currently only available on OpenBSD. Marisa Emerson, reviewed by Thomas Munro http://git.postgresql.org/pg/commitdiff/34c33a1f00259ce5e3e1d1b4a784037adfca6057
  • Fix unstable regression test output. Output order from the pg_indexes view might vary depending on the phase of the moon, so add ORDER BY to ensure stable results of tests added by commit 386e3d7609c49505e079c40c65919d99feb82505. Per buildfarm. http://git.postgresql.org/pg/commitdiff/690c543550b0d2852060c18d270cdb534d339d9a
  • Run pgindent on generic_xlog.c. This code desperately needs some micro-optimization, and I'd like it to be formatted a bit more nicely while I work on it. http://git.postgresql.org/pg/commitdiff/2dd318d277b8e1d8269b030f545240193943162f
  • Code review/prettification for generic_xlog.c. Improve commentary, use more specific names for the delta fields, const-ify pointer arguments where possible, avoid assuming that initializing only the first element of a local array will guarantee that the remaining elements end up as we need them. (I think that code in generic_redo actually worked, but only because InvalidBuffer is zero; this is a particularly ugly way of depending on that ...) http://git.postgresql.org/pg/commitdiff/db03cf375d602e417eda6b7a55eead91618e1398
  • Get rid of blinsert()'s use of GenericXLogUnregister(). That routine is dangerous, and unnecessary once we get rid of this one caller. In passing, fix failure to clean up temp memory context, or switch back to caller's context, during slowest exit path. http://git.postgresql.org/pg/commitdiff/80cf18910c8edf2575c306dde9ead192bdb0863a
  • Get rid of GenericXLogUnregister(). This routine is unsafe as implemented, because it invalidates the page image pointers returned by previous GenericXLogRegister() calls. Rather than complicate the API or the implementation to avoid that, let's just get rid of it; the use-case for having it seems much too thin to justify a lot of work here. While at it, do some wordsmithing on the SGML docs for generic WAL. http://git.postgresql.org/pg/commitdiff/08e785436f84f8824149a2182b0cb9ce2c28e31d
  • Fix PL/Python ereport() test to work on Python 2.3. Per buildfarm. Pavel Stehule http://git.postgresql.org/pg/commitdiff/c7a141a9866b8c15d9e3b6fd5310e54837900394
  • Micro-optimize GenericXLogFinish(). Make the inner comparison loops of computeDelta() as tight as possible by pulling considerations of valid and invalid ranges out of the inner loops, and extending a match or non-match detection as far as possible before deciding what to do next. To keep this tractable, give up the possibility of merging fragments across the pd_lower to pd_upper gap. The fraction of pages where that could happen (ie, there are 4 or fewer bytes in the gap, *and* data changes immediately adjacent to it on both sides) is too small to be worth spending cycles on. Also, avoid two BLCKSZ-length memcpy()s by computing the delta before moving data into the target buffer, instead of after. This doesn't save nearly as many cycles as being tenser about computeDelta(), but it still seems worth doing. On my machine, this patch cuts a full 40% off the runtime of contrib/bloom's regression test. http://git.postgresql.org/pg/commitdiff/68689c66efcda6f217119432edfbdf95a50b26e2
  • Further minor improvement in generic_xlog.c: always say REGBUF_STANDARD. Since we're requiring pages handled by generic_xlog.c to be standard format, specify REGBUF_STANDARD when doing a full-page image, so that xloginsert.c can compress out the "hole" between pd_lower and pd_upper. Given the current API in which this path will be taken only for a newly initialized page, the hole is likely to be particularly large in such cases, so that this oversight could easily be performance-significant. I don't notice any particular change in the runtime of contrib/bloom's regression test, though. http://git.postgresql.org/pg/commitdiff/660d5fb856c61df2de2cedb26249404ffc58cb89
  • Improve contrib/bloom regression test using code coverage info. Originally, this test created a 100000-row test table, which made it run rather slowly compared to other contrib tests. Investigation with gcov showed that we got no further improvement in code coverage after the first 700 or so rows, making the large table 99% a waste of time. Cut it back to 2000 rows to fix the runtime problem and still leave some headroom for testing behaviors that may appear later. A closer look at the gcov results showed that the main coverage omissions in contrib/bloom occurred because the test never filled more than one entry in the notFullPage array; which is unsurprising because it exercised index cleanup only in the scenario of complete table deletion, allowing every page in the index to become deleted rather than not-full. Add testing that allows the not-full path to be exercised as well. Also, test the amvalidate function, because blvalidate.c had zero coverage without that, and besides it's a good idea to check for mistakes in the bloom opclass definitions. http://git.postgresql.org/pg/commitdiff/cf223c3bf5ba16232147c66b5fef4037aafe747c
  • Fix access-to-already-freed-memory issue in plpython's error handling. PLy_elog() could attempt to access strings that Python had already freed, because the strings that PLy_get_spi_error_data() returns are simply pointers into storage associated with the error "val" PyObject. That's fine at the instant PLy_get_spi_error_data() returns them, but just after that PLy_traceback() intentionally releases the only refcount on that object, allowing it to be freed --- so that the strings we pass to ereport() are dangling pointers. In principle this could result in garbage output or a coredump. In practice, I think the risk is pretty low, because there are no Python operations between where we decrement that refcount and where we use the strings (and copy them into PG storage), and thus no reason for Python to recycle the storage. Still, it's clearly hazardous, and it leads to Valgrind complaints when running under a Valgrind that hasn't been lobotomized to ignore Python memory allocations. The code was a mess anyway: we fetched the error data out of Python (clearing Python's error indicator) with PyErr_Fetch, examined it, pushed it back into Python with PyErr_Restore (re-setting the error indicator), then immediately pulled it back out with another PyErr_Fetch. Just to confuse matters even more, there were some gratuitous-and-yet-hazardous PyErr_Clear calls in the "examine" step, and we didn't get around to doing PyErr_NormalizeException until after the second PyErr_Fetch, making it even less clear which object was being manipulated where and whether we still had a refcount on it. (If PyErr_NormalizeException did substitute a different "val" object, it's possible that the problem could manifest for real, because then we'd be doing assorted Python stuff with no refcount on the object we have string pointers into.) So, rearrange all that into some semblance of sanity, and don't decrement the refcount on the Python error objects until the end of PLy_elog(). In HEAD, I failed to resist the temptation to reformat some messy bits from 5c3c3cd0a3046339 along the way. Back-patch as far as 9.2, because the code is substantially the same that far back. I believe that 9.1 has the bug as well; but the code around it is rather different and I don't want to take a chance on breaking something for what seems a low-probability problem. http://git.postgresql.org/pg/commitdiff/7e3bb080387f4143cdc908bf97daf9a8abdc445f
  • Clean up foreign-key caching code in planner. Coverity complained that the code added by 015e88942aa50f0d lacked an error check for SearchSysCache1 failures, which it should have. But the code was pretty duff in other ways too, including failure to think about whether it could really cope with arrays of different lengths. http://git.postgresql.org/pg/commitdiff/5306df2831ab012d8008691f833457bc299962aa
  • pg_dump: add missing "destroyPQExpBuffer(query)" in dumpForeignServer(). Coverity complained about this resource leak (why now, I don't know, since it's been like that a long time). Our general policy in pg_dump is that PQExpBuffers are worth cleaning up, so do it here too. But don't bother with a back-patch, because it seems unlikely that very many databases contain enough FOREIGN SERVER objects to notice. http://git.postgresql.org/pg/commitdiff/074050f16a2db9b5ebe5c9f8fdb211cbb810e746
  • Add comment about intentional fallthrough in switch. Coverity complained about an apparent missing "break" in a switch added by bb140506df605fab. The human-readable comments are pretty clear that this is intentional, but add a standard /* FALL THRU */ comment to make it clear to tools too. http://git.postgresql.org/pg/commitdiff/1630f5b92a3a00aff5674f31af1d418628a00ac7
  • Fix poorly thought-through code from commit 5c3c3cd0a3046339. It's not entirely clear to me whether PyString_AsString can return null (looks like the answer might vary between Python 2 and 3). But in any case, this code's attempt to cope with the possibility was quite broken, because pstrdup() neither allows a null argument nor ever returns a null. Moreover, the code below this point assumes that "message" is a palloc'd string, which would not be the case for a dgettext result. Fix both problems by doing the pstrdup step separately. http://git.postgresql.org/pg/commitdiff/f73b2bbbdcb387aa90ff619fe03d1924ed82b868
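
A small sketch of the new \gexec meta-command described above: each field of the just-entered query's result is sent back to the server as a SQL command, so a statement-generating query can be run in a single step (the table names here are hypothetical):

    SELECT format('CREATE INDEX ON %I (%I)', tablename, 'id')
    FROM (VALUES ('orders'), ('customers')) AS t(tablename)
    \gexec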

Dean Rasheed pushed:

  • Improve estimate of distinct values in estimate_num_groups(). When adjusting the estimate for the number of distinct values from a rel in a grouped query to take into account the selectivity of the rel's restrictions, use a formula that is less likely to produce under-estimates. The old formula simply multiplied the number of distinct values in the rel by the restriction selectivity, which would be correct if the restrictions were fully correlated with the grouping expressions, but can produce significant under-estimates in cases where they are not well correlated. The new formula is based on the random selection probability, and so assumes that the restrictions are not correlated with the grouping expressions. This is guaranteed to produce larger estimates, and of course risks over-estimating in cases where the restrictions are correlated, but that has less severe consequences than under-estimating, which might lead to a HashAgg that consumes an excessive amount of memory. This could possibly be improved upon in the future by identifying correlated restrictions and using a hybrid of the old and new formulae. Author: Tomas Vondra, with some hacking be me Reviewed-by: Mark Dilger, Alexander Korotkov, Dean Rasheed and Tom Lane Discussion: http://www.postgresql.org/message-id/flat/56CD0381.5060502@2ndquadrant.com http://git.postgresql.org/pg/commitdiff/84f9a35e398f863c62440d3f82fc57b4fedc5d08
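
A rough sketch of the "random selection probability" idea behind the new estimate (hedged; the exact expression used in the code may differ slightly): with N rows in the relation, d distinct values assumed uniformly distributed, and n rows surviving the restrictions, the expected number of distinct values among the surviving rows is approximately

    d \left( 1 - \left( 1 - \frac{n}{N} \right)^{N/d} \right)

which is never smaller than the old estimate of d \cdot n/N, consistent with the commit's statement that the new formula can only increase the estimate.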

Teodor Sigaev pushed:

Álvaro Herrera pushed:

Peter Eisentraut pushed:

  • Fix error message from wal_level value renaming found by Ian Barwick. http://git.postgresql.org/pg/commitdiff/4dcd4da98c786c48b0dbf129c8f7ea592c34a185
  • pg_dump: Add table qualifications to some tags. Some object types have names that are only unique for one table. But for those we generally didn't put the table name into the dump TOC tag. So it was impossible to identify these objects if the same name was used for multiple tables. This affects policies, column defaults, constraints, triggers, and rules. Fix by adding the table name to the TOC tag, so that it now reads "$schema $table $object". Reviewed-by: Michael Paquier <michael.paquier@gmail.com> http://git.postgresql.org/pg/commitdiff/3b3fcc4eeaeecff315420833975e7c87d760bfe1
  • Set PAM_RHOST item for PAM authentication. The PAM_RHOST item is set to the remote IP address or host name and can be used by PAM modules. A pg_hba.conf option is provided to choose between IP address and resolved host name. From: Grzegorz Sampolski <grzsmp@gmail.com> Reviewed-by: Haribabu Kommi <kommi.haribabu@gmail.com> http://git.postgresql.org/pg/commitdiff/2f1d2b7a75fecad25295cb3f453503eb6a176d4f
  • Fix printf format. http://git.postgresql.org/pg/commitdiff/8b737f90843157706b8b5eb401b2aff08da77781
  • Replace printf format %i by %d. see also ce8d7bb6440710058503d213b2aafcdf56a5b481 http://git.postgresql.org/pg/commitdiff/339025c68f95d3cb2c42478109cafeaf414c7fe0
  • Distrust external OpenSSL clients; clear err queue. OpenSSL has an unfortunate tendency to mix per-session state error handling with per-thread error handling. This can cause problems when programs that link to libpq with OpenSSL enabled have some other use of OpenSSL; without care, one caller of OpenSSL may cause problems for the other caller. Backend code might similarly be affected, for example when a third party extension independently uses OpenSSL without taking the appropriate precautions. To fix, don't trust other users of OpenSSL to clear the per-thread error queue. Instead, clear the entire per-thread queue ahead of certain I/O operations when it appears that there might be trouble (these I/O operations mostly need to call SSL_get_error() to check for success, which relies on the queue being empty). This is slightly aggressive, but it's pretty clear that the other callers have a very dubious claim to ownership of the per-thread queue. Do this is both frontend and backend code. Finally, be more careful about clearing our own error queue, so as to not cause these problems ourself. It's possibly that control previously did not always reach SSLerrmessage(), where ERR_get_error() was supposed to be called to clear the queue's earliest code. Make sure ERR_get_error() is always called, so as to spare other users of OpenSSL the possibility of similar problems caused by libpq (as opposed to problems caused by a third party OpenSSL library like PHP's OpenSSL extension). Again, do this is both frontend and backend code. See bug #12799 and https://bugs.php.net/bug.php?id=68276 Based on patches by Dave Vitek and Peter Eisentraut. From: Peter Geoghegan <pg@bowt.ie> http://git.postgresql.org/pg/commitdiff/7c7d4fddab82dc756d8caa67b1b31fcdde355aab

Magnus Hagander pushed:

  • Fix typo. Etsuro Fujita http://git.postgresql.org/pg/commitdiff/9457b591b949d3c256dd91043df71fb11657227a
  • Implement backup API functions for non-exclusive backups. Previously non-exclusive backups had to be done using the replication protocol and pg_basebackup. With this commit it's now possible to make them using pg_start_backup/pg_stop_backup as well, as long as the backup program can maintain a persistent connection to the database. Doing this, backup_label and tablespace_map are returned as results from pg_stop_backup() instead of being written to the data directory. This makes the server safe from a crash during an ongoing backup, which can be a problem with exclusive backups. The old syntax of the functions remain and work exactly as before, but since the new syntax is safer this should eventually be deprecated and removed. Only reference documentation is included. The main section on backup still needs to be rewritten to cover this, but since that is already scheduled for a separate large rewrite, it's not included in this patch. Reviewed by David Steele and Amit Kapila http://git.postgresql.org/pg/commitdiff/7117685461af50f50c03f43e6a622284c8d54694
  • Add authentication parameters compat_realm and upn_username for SSPI. These parameters are available for SSPI authentication only, to make it behave more like "normal gssapi" while still making it possible to maintain compatibility. compat_realm is on by default, but can be turned off to make the authentication use the full Kerberos realm instead of the NetBIOS name. upn_username is off by default, and can be turned on to return the user's Kerberos UPN rather than the SAM-compatible name (a user in Active Directory can have both a legacy SAM-compatible username and a new Kerberos one. Normally they are the same, but not always) Author: Christian Ullrich Reviewed by: Robbie Harwood, Alvaro Herrera, me http://git.postgresql.org/pg/commitdiff/35e2e357cb054dc9e5d890fe754c56f0722f015e
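
A hedged sketch of the new non-exclusive backup flow described above (the label is arbitrary, and the session must stay open between the two calls):

    -- arguments: label, fast checkpoint?, exclusive?
    SELECT pg_start_backup('nightly', false, false);
    -- ... copy the data directory with an external tool while this session stays open ...
    -- With exclusive = false, pg_stop_backup() returns the backup_label and
    -- tablespace_map contents as result columns instead of writing them into
    -- the data directory.
    SELECT * FROM pg_stop_backup(false);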

Robert Haas pushed:

Fujii Masao pushed:

  • Support multiple synchronous standby servers. Previously synchronous replication offered only the ability to confirm that all changes made by a transaction had been transferred to at most one synchronous standby server. This commit extends synchronous replication so that it supports multiple synchronous standby servers. It enables users to consider one or more standby servers as synchronous, and increase the level of transaction durability by ensuring that transaction commits wait for replies from all of those synchronous standbys. Multiple synchronous standby servers are configured in synchronous_standby_names which is extended to support new syntax of 'num_sync ( standby_name [ , ... ] )', where num_sync specifies the number of synchronous standbys that transaction commits need to wait for replies from and standby_name is the name of a standby server. The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before is also still supported. It's the same as new syntax with num_sync=1. This commit doesn't include "quorum commit" feature which was discussed in pgsql-hackers. Synchronous standbys are chosen based on their priorities. synchronous_standby_names determines the priority of each standby for being chosen as a synchronous standby. The standbys whose names appear earlier in the list are given higher priority and will be considered as synchronous. Other standby servers appearing later in this list represent potential synchronous standbys. The regression test for multiple synchronous standbys is not included in this commit. It should come later. Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs, Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen, Rajeev Rastogi Many thanks to the various individuals who were involved in discussing and developing this feature. http://git.postgresql.org/pg/commitdiff/989be0810dffd08b54e1caecec0677608211c339
  • Use proper format specifier %X/%X for LSN, again. Commit cee31f5 fixed this problem, but commit 989be08 accidentally reverted the fix. Thomas Munro http://git.postgresql.org/pg/commitdiff/ead9963c471ccde50ff220e8294ea11a57eee91c
  • Fix a couple of places in doc that implied there was only one sync standby. Thomas Munro http://git.postgresql.org/pg/commitdiff/8643b91ecf8f47a1307df4a00d66b2fceada0d6f
  • Add regression tests for multiple synchronous standbys. Authors: Suraj Kharage, Michael Paquier, Masahiko Sawada, refactored by me Reviewed-By: Kyotaro Horiguchi http://git.postgresql.org/pg/commitdiff/196b72fb9a5556c66f2b012cc4e869175a3049fa
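
For example, to wait for replies from two of three named standbys using the new synchronous_standby_names syntax (the standby names here are hypothetical; the parameter is reloadable):

    ALTER SYSTEM SET synchronous_standby_names = '2 (node_a, node_b, node_c)';
    SELECT pg_reload_conf();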

Simon Riggs pushed:

  • Revert bf08f2292ffca14fd133aa0901d1563b6ecd6894. Remove recent changes to logging XLOG_RUNNING_XACTS by request. http://git.postgresql.org/pg/commitdiff/cac0e36682970ec1276d3da3d3ee37325544a2bb
  • Avoid archiving XLOG_RUNNING_XACTS on idle server. If archive_timeout > 0 we should avoid logging XLOG_RUNNING_XACTS if idle. Bug 13685 reported by Laurence Rowe, investigated in detail by Michael Paquier, though this is not his proposed fix. 20151016203031.3019.72930@wrigleys.postgresql.org Simple non-invasive patch to allow later backpatch to 9.4 and 9.5 http://git.postgresql.org/pg/commitdiff/bf08f2292ffca14fd133aa0901d1563b6ecd6894
  • Modify test_decoding/messages to remove non-ascii chars http://git.postgresql.org/pg/commitdiff/d25379eb23383f1d2f969e65e0332b47c19aea94
  • Generic Messages for Logical Decoding. API and mechanism to allow generic messages to be inserted into WAL that are intended to be read by logical decoding plugins. This commit adds an optional new callback to the logical decoding API. Messages are either text or bytea. Messages can be transactional, or not, and are identified by a prefix to allow multiple concurrent decoding plugins. (Not to be confused with Generic WAL records, which are intended to allow crash recovery of extensible objects.) Author: Petr Jelinek and Andres Freund Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs Discussion: 5685F999.6010202@2ndquadrant.com http://git.postgresql.org/pg/commitdiff/3fe3511d05127cc024b221040db2eeb352e7d716
  • Load FK defs into relcache for use by planner. Fastpath ignores this if no triggers defined. Author: Tomas Vondra, with fastpath and comments added by me Reviewers: David Rowley, Simon Riggs http://git.postgresql.org/pg/commitdiff/015e88942aa50f0d419ddac00e63bb06d6e62e86
  • Use Foreign Key relationships to infer multi-column join selectivity. In cases where joins use multiple columns we currently assess each join separately causing gross mis-estimates for join cardinality. This patch adds use of FK information for the first time into the planner. When FKs are present and we have multi-column join information, plan estimates will be drastically improved. Cases with multiple FKs are handled, though partial matches are ignored currently. Net effect is substantial performance improvements for joins in many common cases. Additional planning time is isolated to cases that are currently performing poorly, measured at 0.08 - 0.15 ms. Please watch for planner performance regressions; circumstances seem unlikely but the law of unintended consequences may apply somewhen. Additional complex tests welcome to prove this before release. Tests can be performed using SET enable_fkey_estimates = on | off using scripts provided during Hackers discussions, message id: 552335D9.3090707@2ndquadrant.com Authors: Tomas Vondra and David Rowley Reviewed and tested by Simon Riggs, adding comments only http://git.postgresql.org/pg/commitdiff/137805f89acb361144ec98d9847e26d2848aa57e
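
A small sketch of emitting such a logical-decoding message from SQL with the function added by this work (the prefix and payload are arbitrary; output plugins see the message through the new optional callback):

    -- transactional: decoded only if the surrounding transaction commits
    SELECT pg_logical_emit_message(true, 'my_app', 'hello, decoding plugins');
    -- non-transactional: written to WAL and decoded regardless of commit
    SELECT pg_logical_emit_message(false, 'my_app', 'fire and forget');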

Stephen Frost pushed:

  • Add new catalog called pg_init_privs. This new catalog holds the privileges which the system was initialized with at initdb time, along with any permissions set by extensions at CREATE EXTENSION time. This allows pg_dump (and any other similar use-cases) to detect when the privileges set on initdb-created or extension-created objects have been changed from what they were set to at initdb/extension-creation time and handle those changes appropriately. Reviews by Alexander Korotkov, Jose Luis Tallon http://git.postgresql.org/pg/commitdiff/6c268df1276e9dd73e4d2cc89cf8787e8f186bda
  • In pg_dump, use a bitmap to represent what to include. pg_dump has historically used a simple boolean 'dump' value to indicate if a given object should be included in the dump or not. Instead, use a bitmap which breaks down the components of an object into their distinct pieces and use that bitmap to only include the components requested. This does not include any behavioral change, but is in preparation for the change to dump out just ACLs for objects in pg_catalog. Reviews by Alexander Korotkov, Jose Luis Tallon http://git.postgresql.org/pg/commitdiff/a9f0e8e5a2e779a888988cb64479a6723f668c84
  • In pg_dump, include pg_catalog and extension ACLs, if changed. Now that all of the infrastructure exists, add in the ability to dump out the ACLs of the objects inside of pg_catalog or the ACLs for objects which are members of extensions, but only if they have been changed from their original values. The original values are tracked in pg_init_privs. When pg_dump'ing 9.6-and-above databases, we will dump out the ACLs for all objects in pg_catalog and the ACLs for all extension members, where the ACL has been changed from the original value which was set during either initdb or CREATE EXTENSION. This should not change dumps against pre-9.6 databases. Reviews by Alexander Korotkov, Jose Luis Tallon http://git.postgresql.org/pg/commitdiff/23f34fa4ba358671adab16773e79c17c92cbc870
  • In pg_dump, split "dump" into "dump" and "dump_contains". Historically, the "dump" component of the namespace has been used to decide if the objects inside of the namespace should be dumped also. Given that "dump" is now a bitmask and may be partial, and we may want to dump out all components of the namespace object but only some of the components of objects contained in the namespace, create a "dump_contains" bitmask which will represent what components of the objects inside of a namespace should be dumped out. No behavior change here, but in preparation for a change where we will dump out just the ACLs of objects in pg_catalog, but we might not dump out the ACL of the pg_catalog namespace itself (for instance, when it hasn't been changed from the value set at initdb time). Reviews by Alexander Korotkov, Jose Luis Tallon http://git.postgresql.org/pg/commitdiff/d217b2c360cb9a746b4ef122c568bdfedb6d726e
  • Bump catversion for pg_dump dump catalog ACL patches. Pointed out by Tom. http://git.postgresql.org/pg/commitdiff/29dd1504a12f324c75f6b5ce8863505e499633ec
  • Use GRANT system to manage access to sensitive functions. Now that pg_dump will properly dump out any ACL changes made to functions which exist in pg_catalog, switch to using the GRANT system to manage access to those functions. This means removing 'if (!superuser()) ereport()' checks from the functions themselves and then REVOKEing EXECUTE right from 'public' for these functions in system_views.sql. Reviews by Alexander Korotkov, Jose Luis Tallon http://git.postgresql.org/pg/commitdiff/1574783b4ced0356fbc626af1a1a469faa6b41e1
  • GRANT rights to CURRENT_USER instead of adding roles. We shouldn't be adding roles during the regression tests as that can cause back-to-back installcheck runs to fail and users running the regression tests likely don't want those extra roles. Pointed out by Tom. http://git.postgresql.org/pg/commitdiff/6928484bda454f9ab2456d385b2d317f18b6bf1a
  • In dumpTable, re-instate the skipping logic. Pretty sure I removed this based on some incorrect thinking that it was no longer possible to reach this point for a table which will not be dumped, but that's clearly wrong. Pointed out on IRC by Erik Rijkers. http://git.postgresql.org/pg/commitdiff/689f9a058854a1a32e994818dd6d79f49d8f8a1b
  • Fix improper usage of 'dump' bitmap. Now that 'dump' is a bitmap, we can't simply set it to 'true'. Noticed while debugging the prior issue. http://git.postgresql.org/pg/commitdiff/fa6075e5515c6878b2c1fe1c6435dd7ed847857d
  • Create default roles. This creates an initial set of default roles which administrators may use to grant access to, historically, superuser-only functions. Using these roles instead of granting superuser access reduces the number of superuser roles required for a system. Documentation for each of the default roles has been added to user-manag.sgml. Bump catversion to 201604082, as we had a commit that bumped it to 201604081 and another that set it back to 201604071... Reviews by José Luis Tallón and Robert Haas http://git.postgresql.org/pg/commitdiff/7a542700df25eaf97b794bff63606176433dcdda [see the sketch after this list]
  • Reserve the "pg_" namespace for roles. This will prevent users from creating roles which begin with "pg_" and will check for those roles before allowing an upgrade using pg_upgrade. This will allow for default roles to be provided at initdb time. Reviews by José Luis Tallón and Robert Haas http://git.postgresql.org/pg/commitdiff/293007898d3fa5a815c1c5814df53627553f114d
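
A hedged sketch of the administration model that the GRANT-system and default-roles commits above enable; the backup_operator role is invented for the example, and the exact set of functions and default roles shipped in a given release should be checked against its documentation:

    -- Instead of handing out superuser, grant narrowly scoped rights:
    CREATE ROLE backup_operator LOGIN;
    GRANT pg_signal_backend TO backup_operator;        -- one of the new pg_* default roles
    -- Formerly superuser-only functions can now be opened up with GRANT,
    -- and pg_dump preserves the changed ACL thanks to pg_init_privs:
    GRANT EXECUTE ON FUNCTION pg_switch_xlog() TO backup_operator;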

Noah Misch pushed:

  • Standardize GetTokenInformation() error reporting. Commit c22650cd6450854e1a75064b698d7dcbb4a8821a sparked a discussion about diverse interpretations of "token user" in error messages. Expel old and new specimens of that phrase by making all GetTokenInformation() callers report errors the way GetTokenUser() has been reporting them. These error conditions almost can't happen, so users are unlikely to observe this change. Reviewed by Tom Lane and Stephen Frost. http://git.postgresql.org/pg/commitdiff/f2b1b3079ce9d2965f6e450585f24d18cdf5647b
  • Remove redundant message in AddUserToTokenDacl(). GetTokenUser() will have reported an adequate error message. These error conditions almost can't happen, so users are unlikely to observe this change. Reviewed by Tom Lane and Stephen Frost. http://git.postgresql.org/pg/commitdiff/33d3fc5e2aac32fcf356c09cee4bfded6613a1f3

Kevin Grittner pushed:

  • Detect SSI conflicts before reporting constraint violations. While prior to this patch the user-visible effect on the database of any set of successfully committed serializable transactions was always consistent with some one-at-a-time order of execution of those transactions, the presence of declarative constraints could allow errors to occur which were not possible in any such ordering, and developers had no good workarounds to prevent user-facing errors where they were not necessary or desired. This patch adds a check for serialization failure ahead of duplicate key checking so that if a developer explicitly (redundantly) checks for the pre-existing value they will get the desired serialization failure where the problem is caused by a concurrent serializable transaction; otherwise they will get a duplicate key error. While it would be better if the reads performed by the constraints could count as part of the work of the transaction for serialization failure checking, and we will hopefully get there some day, this patch allows a clean and reliable way for developers to work around the issue. In many cases existing code will already be doing the right thing for this to "just work". Author: Thomas Munro, with minor editing of docs by me Reviewed-by: Marko Tiikkaja, Kevin Grittner http://git.postgresql.org/pg/commitdiff/fcff8a575198478023ada8a48e13b50f70054766
  • Modify BufferGetPage() to prepare for "snapshot too old" feature. This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made. http://git.postgresql.org/pg/commitdiff/8b65cf4c5edabdcae45ceaef7b9ac236879aae50
  • Add snapshot_too_old to MSVC @contrib_excludes. The buildfarm showed failure for Windows MSVC builds due to this omission. This might not be the only problem with the Makefile for this feature, but hopefully this will get it past the immediate problem. Fix suggested by Tom Lane http://git.postgresql.org/pg/commitdiff/279d86afdbed550425bc9d1327ade2dc0028ad33
  • Fix typo in C comment. http://git.postgresql.org/pg/commitdiff/381200be4b565292eba6f62200248cb775f06940
  • Turn special page pointer validation to static inline function. Inclusion of multiple macros inside another macro was pushing MSVC past its size limit. Reported by buildfarm. http://git.postgresql.org/pg/commitdiff/56dffb5a73ab157fc8d35a76c1170d656a051f14
  • Add the "snapshot too old" feature. This feature is controlled by a new old_snapshot_threshold GUC. A value of -1 disables the feature, and that is the default. The value of 0 is just intended for testing. Above that it is the number of minutes a snapshot can reach before pruning and vacuum are allowed to remove dead tuples which the snapshot would otherwise protect. The xmin associated with a transaction ID does still protect dead tuples. A connection which is using an "old" snapshot does not get an error unless it accesses a page modified recently enough that it might not be able to produce accurate results. This is similar to the Oracle feature, and we use the same SQLSTATE and error message for compatibility. http://git.postgresql.org/pg/commitdiff/848ef42bb8c7909c9d7baa38178d4a209906e7c1
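
A configuration sketch for the new GUC described in the last item; the one-hour threshold is an arbitrary example, and the setting only takes effect after a server restart:

    -- Let pruning and vacuum remove dead tuples once protecting snapshots are
    -- older than one hour; a query still using such a snapshot may then fail
    -- with ERROR: snapshot too old if it touches recently modified pages.
    ALTER SYSTEM SET old_snapshot_threshold = '60min';
    -- after restarting the server:
    SHOW old_snapshot_threshold;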

Andres Freund pushed:

  • Increase maximum number of clog buffers. Benchmarking has shown that the current number of clog buffers limits scalability. We've previously increased the number in 33aaa139, but that's not sufficient with a large number of clients. We've benchmarked the cost of increasing the limit by benchmarking worst case scenarios; testing showed that 128 buffers don't cause a regression, even in contrived scenarios, whereas 256 does. There are a number of more complex patches flying around to address various clog scalability problems, but this is simple enough that we can get it into 9.6; and is beneficial even after those patches have been applied. It is a bit unsatisfactory to increase this in small steps every few releases, but a better solution seems to require a rewrite of slru.c; not something done quickly. Author: Amit Kapila and Andres Freund Discussion: CAA4eK1+-=18HOrdqtLXqOMwZDbC_15WTyHiFruz7BvVArZPaAw@mail.gmail.com http://git.postgresql.org/pg/commitdiff/5364b357fb115ed4dc7174085d8f59d9425638dd
  • Expose more out/readfuncs support functions. Previously bcac23d exposed a subset of support functions, namely the ones Kaigai found useful. In 20160304193704.elq773pyg5fyl3mi@alap3.anarazel.de I mentioned that there are some functions missing to use the facility in an external project. To avoid having to add functions piecemeal, add all the functions which are used to define READ_* and WRITE_* macros; users of the extensible node functionality are likely to need these. Additionally expose outDatum(), which doesn't have its own WRITE_ macro, as it needs information from the embedding struct. Discussion: 20160304193704.elq773pyg5fyl3mi@alap3.anarazel.de http://git.postgresql.org/pg/commitdiff/c1ddd2361f6eb071d51b856c697a4aab22f8c776
  • Avoid the use of a separate spinlock to protect a LWLock's wait queue. Previously we used a spinlock, in addition to the atomically manipulated ->state field, to protect the wait queue. But it's pretty simple to instead perform the locking using a flag in state. Due to 6150a1b0 BufferDescs, on platforms (like PPC) with > 1 byte spinlocks, increased their size above 64 bytes. As 64 bytes is the size we pad allocated BufferDescs to, this can increase false sharing, causing performance problems in turn. Together with the previous commit this reduces the size to <= 64 bytes on all common platforms. Author: Andres Freund Discussion: CAA4eK1+ZeB8PMwwktf+3bRS0Pt4Ux6Rs6Aom0uip8c6shJWmyg@mail.gmail.com 20160327121858.zrmrjegmji2ymnvr@alap3.anarazel.de http://git.postgresql.org/pg/commitdiff/008608b9d51061b1f598c197477b3dc7be9c4a64
  • Allow Pin/UnpinBuffer to operate in a lockfree manner. Pinning/Unpinning a buffer is a very frequent operation; especially in read-mostly cache resident workloads. Benchmarking shows that in various scenarios the spinlock protecting a buffer header's state becomes a significant bottleneck. The problem can be reproduced with pgbench -S on larger machines, but can be considerably worse for queries which touch the same buffers over and over at a high frequency (e.g. nested loops over a small inner table). To allow atomic operations to be used, cram BufferDesc's flags, usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable; that allows them to be manipulated together using 32bit compare-and-swap operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could be lifted by using a 64bit field, but it's not a realistic configuration atm). As not all operations can easily be implemented in a lockfree manner, implement the previous buf_hdr_lock via a flag bit in the atomic variable. That way we can continue to lock the header in places where it's needed, but can get away without acquiring it in the more frequent hot-paths. There are some additional operations which can be done without the lock, but aren't in this patch; but the most important places are covered. As bufmgr.c now essentially re-implements spinlocks, abstract the delay logic from s_lock.c into something more generic. It now has already two users, and more are coming up; there's a follow-up patch for lwlock.c at least. This patch is based on a proof-of-concept written by me, which Alexander Korotkov made into a fully working patch; the committed version is again revised by me. Benchmarking and testing has, amongst others, been provided by Dilip Kumar, Alexander Korotkov, Robert Haas. On a large x86 system improvements for readonly pgbench, with a high client count, of a factor of 8 have been observed. Author: Alexander Korotkov and Andres Freund Discussion: 2400449.GjM57CE0Yg@dinodell http://git.postgresql.org/pg/commitdiff/48354581a49c30f5757c203415aa8412d85b0f70

Andrew Dunstan pushed:

Rejected Patches (for now)

No one was disappointed this week :-)

Pending Patches

Kyotaro HORIGUCHI sent in two more revisions of a patch to add tab completion for IF [NOT] EXISTS to psql.

Emre Hasegeli sent in a patch to change "magick" to "magic."

Artur Zakirov sent in a patch to fix a typo in the documentation of indexam.

Aleksander Alekseev sent in two revisions of a patch to fix an issue in how sigmask works.

David Steele sent in two more revisions of a patch to filter messages including errors that might expose information needlessly.

Fabrízio de Royes Mello sent in two more revisions of a patch to add a sequence access method.

Amit Langote and Kyotaro HORIGUCHI traded patches to fix an issue where altering a foreign table failed to invalidate prepared statement execution plans that depended on its previous state.

Robbie Harwood sent in another revision of a patch to implement GSSAPI encryption.

Etsuro Fujita sent in another revision of a patch to fix odd system-column handling in postgres_fdw join pushdown.

Etsuro Fujita and Rushabh Lathia traded patches to optimize writes to the postgres fdw.

Rod Taylor sent in a patch to implement LOCK TABLE .. DEFERRABLE.

Craig Ringer sent in a patch to add some tests to pg_xlogdump.

Craig Ringer sent in a patch to fix incorrect comments introduced in logical decoding timeline following.

Anastasia Lubennikova sent in five more revisions of a patch to add covering unique indexes.

Pavel Stěhule sent in another revision of a patch to create a RAW output format for COPY.

Aleksander Alekseev sent in a patch to simplify reorderbuffer.c

Simon Riggs sent in another revision of a patch to avoid archiving XLOG_RUNNING_XACTS on idle server.

Muhammad Asif Naeem sent in a patch to add an EMERGENCY option to VACUUM that avoids extending any entries in the VM or FSM.

David Rowley and Tom Lane traded patches to improve performance of outer joins when the outer side is unique.

Fabien COELHO sent in a patch to allow seeding randomness in pgbench.

WANGSHUO sent in a patch to allow UPDATE to operate on column aliases.

Michaël Paquier sent in another revision of a patch to add support for VS 2015 in MSVC scripts.

Michaël Paquier sent in a patch to fix parallel pg_dump on Win32.

Aleksander Alekseev sent in a patch to fix an issue in snapmgr.c that wrongly assumes that subxcnt > 0 iff xcnt > 0.

Anastasia Lubennikova sent in a patch to make the amcheck tool work with covering unique indexes.

Constantin S. Pan sent in another revision of a patch to speed up GIN index builds with parallel workers.

Stephen Frost sent in a patch to fix breakage of pg_dump caused by the covering unique indexes patch.

Stephen Frost sent in a patch to remove superuser checks in pgstattuple 1.4.

David Fetter sent in two more revisions of a patch to implement weighted central moments as aggregates.

Stas Kelvich sent in another revision of a patch to speed up 2PC transactions.

Jeff Janes sent in a patch to add tab completion for ALTER EXTENSION to psql.

David Rowley sent in a patch to make parallel aggregate costs consider combine/serial/deserial functions.

Stephen Frost sent in a patch to add regression tests for CREATE ROLE/USAGE.

Andres Freund sent in a patch to disable pymalloc when running under valgrind.

by N Bougain on Friday 15 April 2016 at 20:42

Tuesday 12 April 2016

Damien Clochard

Monitoring PostgreSQL

A few days ago I took part in PG Day Paris 2016, giving a quick overview of the monitoring solutions available for PostgreSQL. It was an opportunity to present the state of the art of the Postgres ecosystem when it comes to visualization tools!

The slides of my talk are available below:

My talk went rather well and, judging by the feedback I got, the topic interests a lot of people.

Several people came to see me afterwards with remarks… I am taking this article as an opportunity to answer them collectively:

1- It was too short!

Yes, 25 minutes to cover some fifty tools is necessarily a bit brisk :) At the same time, this format suited me perfectly, because the goal of the talk was to give pointers and, above all, a methodology for evaluating the various solutions available to PostgreSQL DBAs.

2- Why didn't you mention ELK among the log analysis solutions?

That was a half-oversight… Granted, the ElasticSearch + Logstash + Kibana trio is very promising, but it is also a fairly complex application stack to install and maintain. My talk was aimed at DBAs and did not cover monitoring in general. And I am convinced that if the goal is only to analyze PostgreSQL logs, then a solution like pgBadger is a hundred times simpler and more efficient than building an ELK stack from scratch. Conversely, if you are a DBA and an ELK pipeline is already in place, then it can be worthwhile to feed your PostgreSQL logs into it. Here is a good starting point if the topic interests you: http://blog.2ndquadrant.com/redislog-integrating-postgresql-with-logstash-for-devops-real-time-monitoring/

3- Why didn't you talk about the negative impact of monitoring on performance?

Any monitoring system will put pressure on your PostgreSQL instance and affect its performance. I deliberately avoided this topic because I hate presenting benchmark results at a conference. Showing numbers and graphs for a few seconds, without being able to give all the context, is totally counter-productive, and you always run the risk that the audience will take one-off results for absolute rules. To me, the important message is not "software X will have less negative impact than software Y", but rather "run your own benchmarks and pick the tool best suited to your traffic and your business". To learn how to run a benchmark, you can read the account of the PoWA vs The Badger match from 2014.

In the end, PG Day Paris was a success, and I thank all the organizers for this event.

Next stop: PG Day France in Lille on 31 May! This time I will be there as an organizer.

by Damien Clochard on Tuesday 12 April 2016 at 22:52

Monday 4 April 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 3 April 2016

Security update releases 9.5.2, 9.4.7, 9.3.12, 9.2.16 and 9.1.21 have been published. Update as soon as possible! http://www.postgresql.org/about/news/1656/

The first meetup of the Islamabad PUG (Pakistan) will take place on 8 April. Details and RSVP here: http://www.meetup.com/Islamabad-PostgreSQL-User-Group/events/229935189/

PostgreSQL Product News

PostgreSQL Jobs for April

PostgreSQL Local

PostgreSQL in the News

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under the CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160404000514.GD24186@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org and in Spanish to pwn (a) arpug.com.ar.

Applied Patches

Andres Freund pushed:

  • pg_rewind: Close backup_label file descriptor. This was a relatively harmless leak, as createBackupLabel() is only called once per pg_rewind invocation. Author: Michael Paquier Reported-By: Michael Paquier Discussion: CAB7nPqRnOw30gOXe2_SPLjh37bgm4V+txbYAPwoXb97nGQ297w@mail.gmail.com Backpatch: 9.5, where pg_rewind was introduced http://git.postgresql.org/pg/commitdiff/a6c845946dac5c1f26cf8729cf61f1d852f75484
  • Fix LWLockReportWaitEnd() parameter list to be (void). Previously it was an "old style" function declaration. http://git.postgresql.org/pg/commitdiff/9f7c527af308dcdaba2f0ff9d362d672e8886fb1
  • pg_rewind: fsync target data directory. Previously pg_rewind did not fsync any files. That's problematic, given that the target directory is modified. If the database was started afterwards, 2ce439f33 luckily already caused the data directory to be synced to disk at postmaster startup; reducing the scope of the problem. To fix, use initdb -S, at the end of the pg_rewind run. It doesn't seem worthwhile to duplicate the code into pg_rewind, and initdb -S is already used that way by pg_upgrade. Reported-By: Andres Freund Author: Michael Paquier, somewhat edited by me Discussion: 20160310034352.iuqgvpmg5qmnxtkz@alap3.anarazel.de CAB7nPqSytVG1o4S3S2pA1O=692ekurJ+fckW2PywEG3sNw54Ow@mail.gmail.com Backpatch: 9.5, where pg_rewind was introduced http://git.postgresql.org/pg/commitdiff/408f0438531eec17ac62f91fc23f72bcfc48dd36

Tom Lane pushed:

  • Clamp adjusted ndistinct to positive integer in estimate_hash_bucketsize(). This avoids a possible divide-by-zero in the following calculation, and rounding the number to an integer seems like saner behavior anyway. Assuming IEEE math, the division would yield +Infinity which would get replaced by 1.0 at the bottom of the function, so nothing really interesting would ensue; but avoiding divide-by-zero seems like a good idea on general principles. Per report from Piotr Stefaniak. No back-patch since this seems mostly cosmetic. http://git.postgresql.org/pg/commitdiff/fa09f8935156533584b4e215bdf70ec1ff968dad
  • Guard against zero vardata.rel->tuples in estimate_hash_bucketsize(). If the referenced rel was proven empty, we'd compute 0/0 here, which results in the function returning NaN. That's a bit more serious than the other zero-divide case. Still, it only seems to be possible in HEAD, so no back-patch. Per report from Piotr Stefaniak. I looked through the rest of selfuncs.c and found no other likely trouble spots. http://git.postgresql.org/pg/commitdiff/d65b665d524a67273b075f468bf3d60ce31f4040
  • Release notes for 9.5.2, 9.4.7, 9.3.12, 9.2.16, 9.1.21. http://git.postgresql.org/pg/commitdiff/499a50571c72f41bb1365970d55dae5c8afcb6ba
  • Code and docs review for commit 3187d6de0e5a9e805b27c48437897e8c39071d45. Fix up check for high-bit-set characters, which provoked "comparison is always true due to limited range of data type" warnings on some compilers, and was unlike the way we do it elsewhere anyway. Fix omission of "$" from the set of valid identifier continuation characters. Get rid of sanitize_text(), which was utterly inconsistent with any other error report anywhere in the system, and wasn't even well designed on its own terms (double-quoting the result string without escaping contained double quotes doesn't seem very well thought out). Fix up error messages, which didn't follow the message style guidelines very well, and were overly specific in situations where the actual mistake might not be what they said. Improve documentation. (I started out just intending to fix the compiler warning, but the more I looked at the patch the less I liked it.) http://git.postgresql.org/pg/commitdiff/d12e5bb79bb535c2df13b76cd7d01f0bb8dc8e4d
  • Document errhidecontext() where it ought to be documented. Seems to have been missed when this function was added. Noted while looking at David Steele's proposal to add another similar function. http://git.postgresql.org/pg/commitdiff/e5a4dea80f2506a7a565508e48aaa52296ff410a
  • Sync our copy of the timezone library with IANA release tzcode2016c. We hadn't done this in about six years, which proves to have been a mistake because there's been a lot of code churn upstream, making the merge rather painful. But putting it off any further isn't going to lessen the pain, and there are at least two incompatible changes that we need to absorb before someone starts complaining that --with-system-tzdata doesn't work at all on their platform, or we get blindsided by a tzdata release that our out-of-date zic can't compile. Last week's "time zone abbreviation differs from POSIX standard" mess was a wake-up call in that regard. This is a sufficiently large patch that I'm afraid to back-patch it immediately, though the foregoing considerations imply that we probably should do so eventually. For the moment, just put it in HEAD so that it can get some testing. Maybe we can wait till the end of the 9.6 beta cycle before deeming it okay. http://git.postgresql.org/pg/commitdiff/1c1a7cbd6a1600c97dfcd9b5dc78a23b5db9bbf6
  • Fix MSVC build for changes in zic. zic now only needs zic.c, but I didn't realize knowledge about it was hardwired into Mkvcbuild.pm. Per buildfarm. http://git.postgresql.org/pg/commitdiff/f5f15ea6aad1b75c1c133a914cf29f9831089a6e
  • Sync tzload() and tzparse() APIs with IANA release tzcode2016c. This brings us a bit closer to matching upstream, but since it affects files outside src/timezone/, we might choose not to back-patch it. Hence keep it separate from the main update patch. http://git.postgresql.org/pg/commitdiff/1f4e9da624a0caf78bcb526f6b05f5993e26f2c7
  • Fix portability issues in 86c43f4e22c0771fd0cc6bce2799802c894ee2ec. INT64_MIN/MAX should be spelled PG_INT64_MIN/MAX, per well established convention in our sources. Less obviously, a symbol named DOUBLE causes problems on Windows builds, so rename that to DOUBLE_CONST; and rename INTEGER to INTEGER_CONST for consistency. Also, get rid of incorrect/obsolete hand-munging of yycolumn, and fix the grammar for float constants to handle expected cases such as ".1". First two items by Michael Paquier, second two by me. http://git.postgresql.org/pg/commitdiff/656ee8489053aafc85324b9ef7e91b645674ffb9
  • Fix zic for Windows. The new coding of dolink() is dependent on link() returning an on-point errno when it fails; but the quick-hack implementation of link() that we'd put in for Windows didn't bother with setting errno. Fix that. Analysis and patch by Christian Ullrich. http://git.postgresql.org/pg/commitdiff/6d257e732b358ee601a114fe3d1640a46317e554
  • Protect zic's symlink() call with #ifdef HAVE_SYMLINK. The IANA crew seem to think that symlink() exists everywhere nowadays, and they may well be right. But we use #ifdef HAVE_SYMLINK elsewhere so for consistency we should do it here too. Noted by Michael Paquier. http://git.postgresql.org/pg/commitdiff/534da37927f97ae7cb1b468963ba9bca747209ea
  • Avoid possibly-unsafe use of Windows' FormatMessage() function. Whenever this function is used with the FORMAT_MESSAGE_FROM_SYSTEM flag, it's good practice to include FORMAT_MESSAGE_IGNORE_INSERTS as well. Otherwise, if the message contains any %n insertion markers, the function will try to fetch argument strings to substitute --- which we are not passing, possibly leading to a crash. This is exactly analogous to the rule about not giving printf() a format string you're not in control of. Noted and patched by Christian Ullrich. Back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/7abc1571652a924ba4258bda0a26df2de03b790e
  • Allow to_timestamp(float8) to convert float infinity to timestamp infinity. With the original SQL-function implementation, such cases failed because we don't support infinite intervals. Converting the function to C lets us bypass the interval representation, which should be a bit faster as well as more flexible. Vitaly Burovoy, reviewed by Anastasia Lubennikova http://git.postgresql.org/pg/commitdiff/e511d878f3bbc205cd260a79740e646eea3c1cd3
  • Fix interval_mul() to not produce insane results. interval_mul() attempts to prevent its calculations from producing silly results, but it forgot that zero times infinity yields NaN in IEEE arithmetic. Hence, a case like '1 second'::interval * 'infinity'::float8 produced a NaN for the months product, which didn't trigger the range check, resulting in bogus and possibly platform-dependent output. This isn't terribly obvious to the naked eye because if you try that exact case, you get "interval out of range" which is what you expect --- but if you look closer, the error is coming from interval_out not interval_mul. interval_mul has allowed a bogus value into the system. Fix by adding isnan tests. Noted while testing Vitaly Burovoy's fix for infinity input to to_timestamp(). Given the lack of field complaints, I doubt this is worth a back-patch. http://git.postgresql.org/pg/commitdiff/a898b409f66f956e99694710f537829db02652c0
  • Remove TZ environment-variable entry from postgres reference page. The server hasn't paid attention to the TZ environment variable since commit ca4af308c32d03db, but that commit missed removing this documentation reference, as did commit d883b916a947a3c6 which added the reference where it now belongs (initdb). Back-patch to 9.2 where the behavior changed. Also back-patch d883b916a947a3c6 as needed. Matthew Somerville http://git.postgresql.org/pg/commitdiff/c3834ef9e8abaca54ae542eac960f96e9fecc9a8
  • Remove just-added tests for to_timestamp(float8) with out-of-range inputs. Reporting the specific out-of-range input value produces platform-dependent results. We could skip reporting the value, but that's contrary to our message style guidelines and unhelpful to users. Or we could add a separate expected-output file for Windows, but that would be a substantial maintenance burden, and these test cases seem unlikely to be worth it. Per buildfarm. http://git.postgresql.org/pg/commitdiff/c53ab8a3af46029b72634ec0643e78661b252f62
  • Suppress uninitialized-variable warnings. My compiler doesn't like the lack of initialization of "flag", and I think it's right: if there were zero keys we'd have an undefined result. The AND of zero items is TRUE, so initialize to TRUE. http://git.postgresql.org/pg/commitdiff/818e59373625d194bdec89631b661c4355d15f13
  • Improve portability of I/O behavior for the geometric types. Formerly, the geometric I/O routines such as box_in and point_out relied directly on strtod() and sprintf() for conversion of the float8 component values of their data types. However, the behavior of those functions is pretty platform-dependent, especially for edge-case values such as infinities and NaNs. This was exposed by commit acdf2a8b372aec1d, which added test cases involving boxes with infinity endpoints, and immediately failed on Windows and AIX buildfarm members. We solved these problems years ago in the main float8in and float8out functions, so let's fix it by making the geometric types use that code instead of depending directly on the platform-supplied functions. To do this, refactor the float8in code so that it can be used to parse just part of a string, and as a convenience make the guts of float8out usable without going through DirectFunctionCall. While at it, get rid of geo_ops.c's fairly shaky assumptions about the maximum output string length for a double, by having it build results in StringInfo buffers instead of fixed-length strings. In passing, convert all the "invalid input syntax for type foo" messages in this area of the code into "invalid input syntax for type %s" to reduce the number of distinct translatable strings, per recent discussion. We would have needed a fair number of the latter anyway for code-sharing reasons, so we might as well just go whole hog. Note: this patch is by no means intended to guarantee that the geometric types uniformly behave sanely for infinity or NaN component values. But any bugs we have in that line were there all along, they were just harder to reach in a platform-independent way. http://git.postgresql.org/pg/commitdiff/50861cd683e86d5ef2dc1cb669fb503225e4eb98
  • Last-minute updates for release notes. Security: CVE-2016-2193, CVE-2016-3065 http://git.postgresql.org/pg/commitdiff/4c46f83386a7e3556856d1e4c9f0c294d16b0dcc
  • Support using index-only scans with partial indexes in more cases. Previously, the planner would reject an index-only scan if any restriction clause for its table used a column not available from the index, even if that restriction clause would later be dropped from the plan entirely because it's implied by the index's predicate. This is a fairly common situation for partial indexes because predicates using columns not included in the index are often the most useful kind of predicate, and we have to duplicate (or at least imply) the predicate in the WHERE clause in order to get the index to be considered at all. So index-only scans were essentially unavailable with such partial indexes. To fix, we have to do detection of implied-by-predicate clauses much earlier in the planner. This patch puts it in check_index_predicates (nee check_partial_indexes), meaning it gets done for every partial index, whereas we previously only considered this issue at createplan time, so that the work was only done for an index actually selected for use. That could result in a noticeable planning slowdown for queries against tables with many partial indexes. However, testing suggested that there isn't really a significant cost, especially not with reasonable numbers of partial indexes. We do get a small additional benefit, which is that cost_index is more accurate since it correctly discounts the evaluation cost of clauses that will be removed. We can also avoid considering such clauses as potential indexquals, which saves useless matching cycles in the case where the predicate columns aren't in the index, and prevents generating bogus plans that double-count the clause's selectivity when the columns are in the index. Tomas Vondra and Kyotaro Horiguchi, reviewed by Kevin Grittner and Konstantin Knizhnik, and whacked around a little by me http://git.postgresql.org/pg/commitdiff/f9aefcb91fc1f73fc43e384f660c120e515af931 [see the example after this list]
  • Another zic portability fix. I should have remembered that we can't use INT64_MODIFIER with sscanf(): configure chooses that to work with snprintf(), but it might be for our src/port/snprintf.c implementation and so not compatible with the platform's sscanf(). This appears to be the explanation for buildfarm member frogmouth's continuing unhappiness with the tzcode update. Fortunately, in all of the places where zic is attempting to read into an int64 variable, it's reading a year which certainly will fit just fine into an int. So make it read into an int with %d, and then cast or copy as necessary. http://git.postgresql.org/pg/commitdiff/c202ecf9023ac3571709c274b326038ae39e90a7
  • Fix oversight in getParamDescriptions(), and improve comments. When getParamDescriptions was changed to handle out-of-memory better by cribbing error recovery logic from getRowDescriptions/getAnotherTuple, somebody omitted to copy the stanza about checking for excess data in the message. But you need to do that, since continue'ing out of the switch in pqParseInput3 means no such check gets applied there anymore. Noted while looking at Michael Paquier's patch that made yet another copy of this advance_and_error logic. (This whole business desperately needs refactoring, because I sure don't want to see a dozen copies of this code, but that's where we seem to be headed. What's more, the "suspend parsing on EOF return" convention is a holdover from protocol 2 and shouldn't exist at all in protocol 3, because we don't process partial messages anymore. But for now, just fix the obvious bug.) Also, fix some wrong/missing comments about what the API spec is for these three functions. This doesn't seem worthy of back-patching, even though it's a bug; the case shouldn't ever arise in the field. http://git.postgresql.org/pg/commitdiff/2306696004dc6b9259a45e76522c01d6ee5d2ee7
  • Get rid of minus zero in box regression test. Commit acdf2a8b added a test case involving minus zero as a box endpoint. This is not very portable, as evidenced by the several older buildfarm members that are failing on the test because they print minus zero as just "0". If there were any significant reason to test this behavior, we could consider carrying a separate expected-file; but it doesn't look to me like there's adequate justification to accept such a maintenance burden. Just change the test to use plain zero, instead. http://git.postgresql.org/pg/commitdiff/a067b50470cf7fda77dd28b03519f2483c2322bf
  • Omit null rows when applying the Haas-Stokes estimator for ndistinct. Previously, we included null rows in the values of n and N that went into the formula, which amounts to considering null as a value in its own right; but the d and f1 values do not include nulls. This is inconsistent, and it contributes to significant underestimation of ndistinct when the column is mostly nulls. In any case stadistinct is defined as the number of distinct non-null values, so we should exclude nulls when doing this computation. This is an aboriginal bug in our application of the Haas-Stokes formula, but we'll refrain from back-patching for fear of destabilizing plan choices in released branches. While at it, make the code a bit more readable by omitting unnecessary casts and intermediate variables. Observation and original patch by Tomas Vondra, adjusted to fix both uses of the formula by Alex Shulgin, cosmetic improvements by me http://git.postgresql.org/pg/commitdiff/be4b4dc75955318e763f5b2e3a990e35366ac797
  • Omit null rows when setting the threshold for what's a most-common value. As with the previous patch, large numbers of null rows could skew this calculation unfavorably, causing us to discard values that have a legitimate claim to be MCVs, since our definition of MCV is that it's most common among the non-null population of the column. Hence, make the numerator of avgcount be the number of non-null sample values not the number of sample rows; likewise for maxmincount in the compute_scalar_stats variant. Also, make the denominator be the number of distinct values actually observed in the sample, rather than reversing it back out of the computed stadistinct. This avoids depending on the accuracy of the Haas-Stokes approximation, and really it's what we want anyway; the threshold should depend only on what we see in the sample, not on what we extrapolate about the contents of the whole column. Alex Shulgin, reviewed by Tomas Vondra and myself http://git.postgresql.org/pg/commitdiff/3d3bf62f30200500637b24fdb7b992a99f9704c3
  • Suppress compiler warning. Some buildfarm members are showing "comparison is always false due to limited range of data type" complaints on this test, so #ifdef it out on machines with 32-bit int. http://git.postgresql.org/pg/commitdiff/45aae8e78967b37f285e99617b919319bf2bf536
  • Add missing "static". Per buildfarm member pademelon. http://git.postgresql.org/pg/commitdiff/5a5b917184b630529635db2e037d298ad90c355d
  • Make all the declarations of WaitEventSetWaitBlock be marked "inline". The inconsistency here triggered compiler warnings on some buildfarm members, and it's surely pretty pointless. http://git.postgresql.org/pg/commitdiff/a1953f3a60cc7d1b8516d0b2c7e82ae8e9242de3
  • Add psql \errverbose command to see last server error at full verbosity. Often, upon getting an unexpected error in psql, one's first wish is that the verbosity setting had been higher; for example, to be able to see the schema-name field or the server code location info. Up to now the only way has been to adjust the VERBOSITY variable and repeat the failing query. That's a pain, and it doesn't work if the error isn't reproducible. This commit adds a psql feature that redisplays the most recent server error at full verbosity, without needing to make any variable changes or re-execute the failed command. We just need to hang onto the latest error PGresult in case the user executes \errverbose, and then apply libpq's new PQresultVerboseErrorMessage() function to it. This will consume some trivial amount of psql memory, but otherwise the cost when the feature isn't used should be negligible. Alex Shulgin, reviewed by Daniel Vérité, some improvements by me http://git.postgresql.org/pg/commitdiff/3cc38ca7d21255721d600eb75d7cc6708c14764b
  • Add libpq support for recreating an error message with different verbosity. Often, upon getting an unexpected error in psql, one's first wish is that the verbosity setting had been higher; for example, to be able to see the schema-name field or the server code location info. Up to now the only way has been to adjust the VERBOSITY variable and repeat the failing query. That's a pain, and it doesn't work if the error isn't reproducible. This commit adds support in libpq for regenerating the error message for an existing error PGresult at any desired verbosity level. This is almost just a matter of refactoring the existing code into a subroutine, but there is one bit of possibly-needed information that was not getting put into PGresults: the text of the last query sent to the server. We must add that string to the contents of an error PGresult. But we only need to save it if it might be used, which with the existing error-formatting code only happens if there is a PG_DIAG_STATEMENT_POSITION error field, which is probably pretty rare for errors in production situations. So really the overhead when the feature isn't used should be negligible. Alex Shulgin, reviewed by Daniel Vérité, some improvements by me http://git.postgresql.org/pg/commitdiff/e3161b231cfaadd4b6438eff2fc1f6cd086f41a9
  • Clean up some stuff in new contrib/bloom module. Coverity complained about implicit sign-extension in the BloomPageGetFreeSpace macro, probably because sizeOfBloomTuple isn't wide enough for size calculations. No overflow is really possible as long as maxoff and sizeOfBloomTuple are small enough to represent a realistic situation, but it seems like a good idea to declare sizeOfBloomTuple as Size not int32. Add missing check on BloomPageAddItem() result, again from Coverity. Avoid core dump due to not allocating so->sign array when scan->numberOfKeys is zero. Also thanks to Coverity. Use FLEXIBLE_ARRAY_MEMBER rather than declaring an array as size 1 when it isn't necessarily. Very minor beautification of related code. Unfortunately, none of the Coverity-detected mistakes look like they could account for the remaining buildfarm unhappiness with this module. It's barely possible that the FLEXIBLE_ARRAY_MEMBER mistake does account for that, if it's enabling bogus compiler optimizations; but I'm not terribly optimistic. We probably still have bugs to find here. http://git.postgresql.org/pg/commitdiff/a9284849b48b04fa2836aaf704659974c13e610d
  • Fix contrib/bloom to not fail under CLOBBER_CACHE_ALWAYS. The code was supposing that rd_amcache wouldn't disappear from under it during a scan; which is wrong. Copy the data out of the relcache rather than trying to reference it there. http://git.postgresql.org/pg/commitdiff/8f75fd1f402acbc30bc15dbf51eb6dec1bbec600
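
To illustrate the partial-index item above (commit f9aefcb9), a small sketch with invented table and column names; with the change, the planner can consider an index-only scan even though status is not stored in the index, because the WHERE clause is implied by the index predicate:

    CREATE TABLE orders (id bigint, status text, total numeric);
    CREATE INDEX orders_open_id_idx ON orders (id) WHERE status = 'open';
    -- The restriction clause is implied by the index predicate, so "status"
    -- need not be fetched from the heap (given a vacuumed table and stats):
    EXPLAIN SELECT id FROM orders WHERE status = 'open';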

Teodor Sigaev pushed:

Álvaro Herrera pushed:

  • Fix minor leak in pg_dump for ACCESS METHOD. Bug reported by Coverity. Author: Michaël Paquier http://git.postgresql.org/pg/commitdiff/37732a2555f109b09b7eedfc44a9de22e22268a4
  • pg_rewind: Improve internationalization. This is mostly cosmetic since two of the three changes are debug messages, and the third one is just a progress indicator. Author: Michaël Paquier http://git.postgresql.org/pg/commitdiff/cad3edef4f51c37c4b4d8667a2e76a81ca53f9e7
  • Update expected file from quoting change. I neglected to update this in 59a2111b23f. Per buildfarm http://git.postgresql.org/pg/commitdiff/4b746f0d07c762cf4b421b59a14dfd053eda1761
  • Mention BRIN as able to do multi-column indexes. Documentation mentioned B-tree, GiST and GIN as able to do multicolumn indexes; I failed to add BRIN to the list. Author: Petr Jediný Reviewed-By: Fujii Masao, Emre Hasegeli http://git.postgresql.org/pg/commitdiff/80b986cf528c912f4efc2b0e9f03611f0f15f4df [see the example after this list]
  • PostgresNode: initialize $timed_out if passed. Corrects an oversight in 2c83f435a3 where the $timed_out reference var isn't initialized; using it would require the caller to initialize it beforehand, which is cumbersome. Author: Craig Ringer http://git.postgresql.org/pg/commitdiff/9bd61311bd48ea53b18bfecb9adddfd844acbffa
  • pgbench: allow a script weight of zero. This refines the previous weight range and allows a script to be "turned off" by passing a zero weight, which is useful when scripting multiple pgbench runs. I did not apply the suggested warning when a script uses zero weight; we use the principle elsewhere that if there's nothing to be done, do nothing quietly. Adjust docs accordingly. Author: Jeff Janes, Fabien Coelho http://git.postgresql.org/pg/commitdiff/a1c935d3b71e44ba36530d47c3ccab6cc9b9eafe
  • I forgot the alternate expected file in previous commit. Without this, the test_slot_timelines module fails "make installcheck" because the required feature is not enabled in a stock server. Per buildfarm http://git.postgresql.org/pg/commitdiff/3a3b309041b0f30066f0b6cb6640563b6ea27cde
  • Blind attempt at fixing Win32 issue on 24c5f1a103c. As best as I can tell, MyReplicationSlot needs to be PGDLLIMPORT in order for the new test_slot_timelines test module to compile. Per buildfarm http://git.postgresql.org/pg/commitdiff/3dd0792ae014c9ceb2c2ade43d0a3132cfeb4fc5
  • Fix broken variable declaration Author: Konstantin Knizhnik http://git.postgresql.org/pg/commitdiff/3501f71c21e31b275b7816551b06a666d9c0c9c9
  • Add missing checks to some of pageinspect's BRIN functions. brin_page_type() and brin_metapage_info() did not enforce being called by superuser, like other pageinspect functions that take bytea do. Since they don't verify the passed page thoroughly, it is possible to use them to read the server memory with a carefully crafted bytea value, up to a few kilobytes from where the input bytea is located. Have them throw errors if called by a non-superuser. Report and initial patch: Andreas Seltenreich Security: CVE-2016-3065 http://git.postgresql.org/pg/commitdiff/3e1338475ffc2eac25de60a9de9ce689b763aced
  • Fix recovery_min_apply_delay test. Previously this test was relying too much on WAL replay to occur in the exact configured interval, which was unreliable on slow or overly busy servers. Use a custom loop instead of poll_query_until, which is hopefully more reliable. Per continued failures on buildfarm member hamster (which is probably the only one running this test suite) Author: Michaël Paquier http://git.postgresql.org/pg/commitdiff/61608d38361f911a741d4a7df63afe3c7221437e
  • Enable logical slots to follow timeline switches. When decoding from a logical slot, it's necessary for xlog reading to be able to read xlog from historical (i.e. not current) timelines; otherwise, decoding fails after failover, because the archives are in the historical timeline. This is required to make "failover logical slots" possible; it currently has no other use, although theoretically it could be used by an extension that creates a slot on a standby and continues to replay from the slot when the standby is promoted. This commit includes a module in src/test/modules with functions to manipulate the slots (which is not otherwise possible in SQL code) in order to enable testing, and a new test in src/test/recovery to ensure that the behavior is as expected. Author: Craig Ringer Reviewed-By: Oleksii Kliukin, Andres Freund, Petr Jelínek http://git.postgresql.org/pg/commitdiff/24c5f1a103ce6656a5cb430d9a996c34e61ab2a5
  • Type names should not be quoted. Our actual convention, contrary to what I said in 59a2111b23f, is not to quote type names, as evidenced by unquoted use of format_type_be() result value in error messages. Remove quotes from recently tweaked messages accordingly. Per note from Tom Lane http://git.postgresql.org/pg/commitdiff/f402b9950120358d1870aacc10070e121d8a17de
  • Improve internationalization of messages involving type names. Change the slightly different variations of the message "function FOO must return type BAR" to a single wording, removing the variability in type name so that they all create a single translation entry; since the type name is not to be translated, there's no point in it being part of the message anyway. Also, change them all to use the same quoting convention, namely that the function name is not to be quoted but the type name is. (I'm not quite sure why this is so, but it's the clear majority.) Some similar messages such as "encoding conversion function FOO must ..." are also changed. http://git.postgresql.org/pg/commitdiff/59a2111b23f6ceec4c777d68e20c1027d3c57c6f
  • Fix logical_decoding_timelines test crashes. In the test_slot_timelines test module, we were abusing passing NULL values, which were received as zeroes on x86 but break on ARM (buildfarm member hamster) by crashing instead. Fix the breakage by marking these functions as STRICT; the InvalidXid value that was previously implicit in NULL values (on x86 at least) can now be passed as 0. Failing to follow the fmgr protocol to check for NULLs beforehand was causing ARM to fail, as evidenced by segmentation faults in buildfarm member hamster. In order to use the new functionality in the test script, use COALESCE in the right spot to avoid forwarding NULL values. This was diagnosed from the hamster crash by Craig Ringer, who also proposed a different patch (checking for NULL values explicitly in the C function code, and keeping the non-strictness in the C functions). I decided to go with this approach instead. http://git.postgresql.org/pg/commitdiff/82c83b337202fa0f5b235bdfaeb992a5cee40ed5
  • pgbench: Remove unused parameter. For some reason this parameter was introduced as unused in 3da0dfb4b146, and has never been used for anything. Remove it. Author: Fabien Coelho http://git.postgresql.org/pg/commitdiff/5cb882675ae239db9d00b16a9467c4f900fb10b6
  • test_slot_timelines: Fix alternate expected output http://git.postgresql.org/pg/commitdiff/f07d18b6e94da6ef93dc4e00096f1e7542814fdb
  • XLogReader general code cleanup. Some minor tweaks and comment additions, for cleanliness sake and to avoid having the upcoming timeline-following patch be polluted with unrelated cleanup. Extracted from a larger patch by Craig Ringer, reviewed by Andres Freund, with some additions by myself. http://git.postgresql.org/pg/commitdiff/3b02ea4f0780ccce7dc116010201dad7ee50a401
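
A quick illustration of the BRIN documentation fix above: BRIN accepts multi-column indexes just like B-tree, GiST and GIN (table and column names invented for the example):

    CREATE TABLE measurements (recorded_at timestamptz, sensor_id int, reading numeric);
    CREATE INDEX ON measurements USING brin (recorded_at, sensor_id);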

Robert Haas pushed:

  • pgbench: Support double constants and functions. The new functions are pi(), random(), random_exponential(), random_gaussian(), and sqrt(). I was worried that this would be slower than before, but, if anything, it actually turns out to be slightly faster, because we now express the built-in pgbench scripts using fewer lines; each \setrandom can be merged into a subsequent \set. Fabien Coelho http://git.postgresql.org/pg/commitdiff/86c43f4e22c0771fd0cc6bce2799802c894ee2ec
  • Fix typo in comment. Thomas Munro http://git.postgresql.org/pg/commitdiff/bd0f206f5588767aac2456ebf6a21f7a6344cd58
  • On all Windows platforms, not just Cygwin, use _timezone and _tzname. Up until now, we've been using timezone and tzname, but Visual Studio 2015 (for which we wish to add support) no longer declares those symbols. All versions since Visual Studio 2003 apparently support the underscore-equipped names, and we don't support anything older than Visual Studio 2005, so this should work OK everywhere. But let's see what the buildfarm thinks. Michael Paquier, reviewed by Petr Jelinek http://git.postgresql.org/pg/commitdiff/868628e4fd44d75987d6c099ac63613cc5417629
  • Don't require a user mapping for FDWs to work. Commit fbe5a3fb73102c2cfec11aaaa4a67943f4474383 accidentally changed this behavior; put things back the way they were, and add some regression tests. Report by Andres Freund; patch by Ashutosh Bapat, with a bit of kibitzing by me. http://git.postgresql.org/pg/commitdiff/5d4171d1c70edfe3e9be1de9e66603af28e3afe1
  • Rework custom scans to work more like the new extensible node stuff. Per discussion, the new extensible node framework is thought to be better designed than the custom path/scan/scanstate stuff we added in PostgreSQL 9.5. Rework the latter to be more like the former. This is not backward-compatible, but we generally don't promise that for C APIs, and there probably aren't many people using this yet anyway. KaiGai Kohei, reviewed by Petr Jelinek and me. Some further cosmetic changes by me. http://git.postgresql.org/pg/commitdiff/f9143d102ffd0947ca904c62b1d3d6fd587e0c80
  • pgbench: Remove \setrandom. You can now do the same thing via \set using the appropriate function, either random(), random_gaussian(), or random_exponential(), depending on the desired distribution. This is not backward-compatible, but per discussion, it's worth it to avoid having the old syntax hang around forever. Fabien Coelho, reviewed by Michael Paquier, and adjusted by me. http://git.postgresql.org/pg/commitdiff/ad9566470b1ba63167d1dc7ae2cb52d88a448f76
  • Fix pgbench documentation error. The description of what the per-transaction log file says for skipped transactions is just plain wrong. Report and patch by Tomas Vondra, reviewed by Fabien Coelho and modified by me. http://git.postgresql.org/pg/commitdiff/d797bf7da2cc954f7b5cd2776b65c6e91cd0cb04
  • Improve pgbench docs regarding per-transaction logging. The old documentation didn't know about the new -b flag, only about -f. Fabien Coelho http://git.postgresql.org/pg/commitdiff/7f0a2c85fb221bae6908fb2fddad21a4c6d14438
  • Allow aggregate transition states to be serialized and deserialized. This is necessary infrastructure for supporting parallel aggregation for aggregates whose transition type is "internal". Such values can't be passed between cooperating processes, because they are just pointers. David Rowley, reviewed by Tomas Vondra and by me. http://git.postgresql.org/pg/commitdiff/5fe5a2cee91117673e04617aeb1a38e305dcd783
  • Fix bug in aggregate (de)serialization commit. resulttypeLen and resulttypeByVal must be set correctly when serializing aggregates, not just when finalizing them. This was in David's final patch but I downloaded the wrong version by mistake and failed to spot the error. David Rowley http://git.postgresql.org/pg/commitdiff/96f8373cad5d6066baeb7a1c5a88f6f5c9661974
  • Add new replication mode synchronous_commit = 'remote_apply'. In this mode, the master waits for the transaction to be applied on the remote side, not just written to disk. That means that you can count on a transaction started on the standby to see all commits previously acknowledged by the master. To make this work, the standby sends a reply after replaying each commit record generated with synchronous_commit >= 'remote_apply'. This introduces a small inefficiency: the extra replies will be sent even by standbys that aren't the current synchronous standby. But previously-existing synchronous_commit levels make no attempt at all to optimize which replies are sent based on what the primary cares about, so this is no worse, and at least avoids any extra replies for people not using the feature at all. Thomas Munro, reviewed by Michael Paquier and by me. Some additional tweaks by me. http://git.postgresql.org/pg/commitdiff/314cbfc5da988eff8998655158f84c9815ecfbcd
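
A minimal sketch of the new replication mode described in the last item, assuming synchronous_standby_names is already configured on the primary:

    -- Wait at COMMIT until the synchronous standby has applied the transaction,
    -- so a read sent to that standby immediately afterwards will see it:
    SET synchronous_commit = 'remote_apply';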

Magnus Hagander pushed:

Fujii Masao pushed:

Stephen Frost pushed:

  • Reset plan->row_security_env and planUserId. In the plancache, we check if the environment we planned the query under has changed in a way which requires us to re-plan, such as when the user for whom the plan was prepared changes and RLS is being used (and, therefore, there may be different policies to apply). Unfortunately, while those values were set and checked, they were not being reset when the query was re-planned and therefore, in cases where we change role, re-plan, and then change role again, we weren't re-planning again. This leads to potentially incorrect policies being applied in cases where role-specific policies are used and a given query is planned under one role and then executed under other roles, which could happen under security definer functions or when a common user and query is planned initially and then re-used across multiple SET ROLEs. Further, extensions which made use of CopyCachedPlan() may suffer from similar issues as the RLS-related fields were not properly copied as part of the plan and therefore RevalidateCachedQuery() would copy in the current settings without invalidating the query. Fix by using the same approach used for 'search_path', where we set the correct values in CompleteCachedPlan(), check them early on in RevalidateCachedQuery() and then properly reset them if re-planning. Also, copy through the values during CopyCachedPlan(). Pointed out by Ashutosh Bapat. Reviewed by Michael Paquier. Back-patch to 9.5 where RLS was introduced. Security: CVE-2016-2193 http://git.postgresql.org/pg/commitdiff/86ebf30fd6d8964bbd5d48db053b0a7ff709a0d7
  • Fix typo in pg_regress.c. s/afer/after Pointed out by Andreas 'ads' Scherbaum http://git.postgresql.org/pg/commitdiff/62b5cd234ba982f71f2501f405a26ed80c92a229

Noah Misch pushed:

Simon Riggs pushed:

  • Avoid pin scan for replay of XLOG_BTREE_VACUUM in all cases. Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds or in some cases minutes while the XLOG_BTREE_VACUUM was replayed. This commit skips the pin scan that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here and WAL is identical. The current commit applies in all cases, effectively replacing commit 687f2cd7a0150647794efe432ae0397cb41b60ff. http://git.postgresql.org/pg/commitdiff/3e4b7d87988f0835f137f15f5c1a40598dd21f3d

Correctifs rejetés (à ce jour)

No one was disappointed this week :-)

Correctifs en attente

Andres Freund and Michaël Paquier traded patches to remove a race condition that could have caused fsync to be assumed to have succeeded when it had not actually been checked.

SAWADA Masahiko sent in three more revisions of a patch to allow having N>1 synchronous standby servers.

Alexander Korotkov sent in a patch to prevent a false alarm about xid wraparound.

Anastasia Lubennikova sent in another revision of a patch to implement covering + unique indexes.

Peter Geoghegan sent in a patch to add an amcheck extension to contrib.

Kyotaro HORIGUCHI sent in a patch for psql fixing the condition used to decide whether to add schema names.

Etsuro Fujita sent in a patch to add GetFdwScanTupleExtraData() and FillFdwScanTupleSysAttr().

Thomas Munro sent in four more revisions of a patch to implement "causal reads."

Dilip Kumar and Robert Haas traded patches to help scale relation extension.

Kyotaro HORIGUCHI and Artur Zakirov traded patches to add tab completion support for IF [NOT] EXISTS to psql.

Alexander Korotkov sent in five more revisions of a patch to move PinBuffer and UnpinBuffer to atomics.

Thomas Munro sent in another revision of a patch to add kqueue support.

Christian Ullrich sent in another revision of a patch to fix an SSPI authentication failure where the wrong realm name was used.

Fabrízio de Royes Mello and Petr Jelínek traded patches to add a sequence access method and implement gapless sequences using same.

Teodor Sigaev sent in another revision of a patch to add ICU support.

Alexander Korotkov sent in another revision of a patch to add partial sort.

Tom Lane sent in a patch to fix an issue that manifested as pg_restore casting check constraints differently.

Amit Langote sent in a patch to perform constraint name uniqueness check for index constraints.

Michaël Paquier sent in a patch to add a missing mention of GSSAPI in MSVC's config_default.pl.

Dmitry Dolgov sent in two more revisions of a patch to add jsonb_insert().
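
For context, the proposed jsonb_insert() inserts a value at a given path without overwriting what is already there. Assuming the signature discussed in the thread, jsonb_insert(target, path, new_value, insert_after), which could still change before commit, it would behave roughly like this:

    -- insert before the element at array index 1 (the default)
    SELECT jsonb_insert('[1, 2, 3]', '{1}', '"new"');        -- [1, "new", 2, 3]
    -- insert after that element instead
    SELECT jsonb_insert('[1, 2, 3]', '{1}', '"new"', true);  -- [1, 2, "new", 3]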

Ian Lawrence Barwick sent in a patch to fix a replication slot creation error message in 9.6.

Andres Freund and Amit Kapila traded patches to speed up CLOG access by increasing the number of CLOG buffers.

Tomas Vondra and Dean Rasheed traded patches to improve GROUP BY estimation.

Pavel Stěhule sent in another revision of a patch to add a raw mode for COPY.

Aleksander Alekseev sent in a patch to implement a --disable-setproctitle flag.

Teodor Sigaev and Dmitry Ivanov traded patches to add a phrase search option to tsearch.

Bernd Helmle sent in a patch to fix an issue where a standalone backend can PANIC during recovery.

Andres Freund sent in another revision of a patch to avoid the use of a separate spinlock to protect LWLock's wait queue.

Peter Eisentraut sent in a patch to add an optional SSL indicator to the psql prompt.

Abhijit Menon-Sen sent in another revision of a patch to work around some issues in extension dependencies.

Stephen Frost sent in a patch to add new catalog called pg_init_privs, change pg_dump to use a bitmap to represent what to include, split "dump" into "dump" and "dump_contains", include pg_catalog and extension ACLs, if changed, and use the GRANT system to manage access to sensitive functions.

Kevin Grittner sent in another revision of a patch to implement "snapshot too old," configured by time.
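
The patch lets old snapshots be invalidated after a configurable amount of time, so that vacuum can clean up dead rows they would otherwise pin. The GUC name below, old_snapshot_threshold, is the one used in the patch discussion and might change; a sketch:

    -- snapshots older than 10 minutes may start getting "snapshot too old" errors;
    -- -1 (the default) disables the feature, and changing it requires a server restart
    ALTER SYSTEM SET old_snapshot_threshold = '10min';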

David Rowley sent in two more revisions of a patch to improve performance for joins where outer side is unique.

Noah Misch sent in another revision of a patch to refer to a TOKEN_USER as "token user".

Fabien COELHO sent in a patch which adds a set of operators (bitwise: & | ^ ~; comparisons: =/== <>/!= < <= > >=; logical: and/&& or/|| xor/^^ not/!) and functions (exp, ln, if) to pgbench.
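
A purely illustrative pgbench fragment using the syntax listed in that patch description (the patch is still under review, so the exact spellings may change):

    \set aid random(1, 100000 * :scale)
    \set bucket :aid & 255
    \set flag if(:scale > 10 and :aid <> 0, 1, 0)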

Tom Lane sent in a patch to make a better MCV cutoff.

par N Bougain le lundi 4 avril 2016 à 21h50

lundi 28 mars 2016

Actualités PostgreSQL.fr

Nouvelles hebdomadaires de PostgreSQL - 27 mars 2016

Les nouveautés des produits dérivés

Offres d'emplois autour de PostgreSQL en mars

PostgreSQL Local

PostgreSQL dans les média

PostgreSQL Weekly News / les nouvelles hebdomadaires vous sont offertes cette semaine par David Fetter. Traduction par l'équipe PostgreSQLFr sous licence CC BY-NC-SA. La version originale se trouve à l'adresse suivante : http://www.postgresql.org/message-id/20160327230616.GB29591@fetter.org

Proposez vos articles ou annonces avant dimanche 15:00 (heure du Pacifique). Merci de les envoyer en anglais à david (a) fetter.org, en allemand à pwn (a) pgug.de, en italien à pwn (a) itpug.org et en espagnol à pwn (a) arpug.com.ar.

Correctifs appliqués

Tom Lane pushed:

  • Teach the configure script to validate its --with-pgport argument. Previously, configure would take any string, including an empty string, leading to obscure compile failures in guc.c. It seems worth expending a few lines of code to ensure that the argument is a decimal number between 1 and 65535. Report and patch by Jim Nasby; reviews by Alex Shulgin, Peter Eisentraut, Ivan Kartyshov http://git.postgresql.org/pg/commitdiff/bf53d5c208a3bdce243a38666fc50f5418c78c3b
  • pg_stat_get_progress_info() should be marked STRICT. I didn't bother with a catversion bump. Report and patch by Thomas Munro http://git.postgresql.org/pg/commitdiff/2da75499879032d8d2f233ca42cc2efe48fd76ef
  • Use repalloc_huge() to enlarge a SPITupleTable's tuple pointer array. Commit 23a27b039d94ba35 widened the rows-stored counters to uint64, but that's academic unless we allow the tuple pointer array to exceed 1GB. (It might be a good idea to provide some other limit on how much storage a SPITupleTable can eat. On the other hand, there are plenty of other ways to drive a backend into swap hell.) Dagfinn Ilmari Mannsåker http://git.postgresql.org/pg/commitdiff/74a379b984d4df91acec2436a16c51caee3526af
  • Improve conversions from uint64 to Perl types. Perl's integers are pointer-sized, so can hold more than INT_MAX on LP64 platforms, and come in both signed (IV) and unsigned (UV). Floating point values (NV) may also be larger than double. Since Perl 5.19.4 array indices are SSize_t instead of I32, so allow up to SSize_t_max on those versions. The limit is not imposed just by av_extend's argument type, but all the array handling code, so remove the speculative comment. Dagfinn Ilmari Mannsåker http://git.postgresql.org/pg/commitdiff/f3f3aae4b7841f4dc51129691a7404a03eb55449
  • Update PL/Perl's comment about hv_store(). Negative klen is documented since Perl 5.16, and 5.6 is no longer supported so no need to comment about it. Dagfinn Ilmari Mannsåker http://git.postgresql.org/pg/commitdiff/07341a2980a37ccbb3a51af2bd2f3c87953d8ea4
  • Rethink representation of PathTargets. In commit 19a541143a09c067 I did not make PathTarget a subtype of Node, and embedded a RelOptInfo's reltarget directly into it rather than having a separately-allocated Node. In hindsight that was misguided micro-optimization, enabled by the fact that at that point we didn't have any Paths with custom PathTargets. Now that PathTarget processing has been fleshed out some more, it's easier to see that it's better to have PathTarget as an indepedent Node type, even if it does cost us one more palloc to create a RelOptInfo. So change it while we still can. This commit just changes the representation, without doing anything more interesting than that. http://git.postgresql.org/pg/commitdiff/307c78852f516042cebacaed411a0391bfeb2129
  • Allow callers of create_foreignscan_path to specify nondefault PathTarget. Although the default choice of rel->reltarget should typically be sufficient for scan or join paths, it's not at all sufficient for the purposes PathTargets were invented for; in particular not for upper-relation Paths. So break API compatibility by adding a PathTarget argument to create_foreignscan_path(). To ease updating of existing code, accept a NULL value of the argument as selecting rel->reltarget. http://git.postgresql.org/pg/commitdiff/28048cbaa285b8ac46940e4b39f985d9885fc698
  • Provide a planner hook at a suitable place for creating upper-rel Paths. In the initial revision of the upper-planner pathification work, the only available way for an FDW or custom-scan provider to inject Paths representing post-scan-join processing was to insert them during scan-level GetForeignPaths or similar processing. While that's not impossible, it'd require quite a lot of duplicative processing to look forward and see if the extension would be capable of implementing the whole query. To improve matters for custom-scan providers, provide a hook function at the point where the core code is about to start filling in upperrel Paths. At this point Paths are available for the whole scan/join tree, which should reduce the amount of redundant effort considerably. (An alternative design that was suggested was to provide a separate hook for each post-scan-join processing step, but that seems messy and not clearly more useful.) Following our time-honored tradition, there's no documentation for this hook outside the source code. As-is, this hook is only meant for custom scan providers, which we can't assume very much about. A followon patch will implement an FDW callback to let FDWs do the same thing in a somewhat more structured fashion. http://git.postgresql.org/pg/commitdiff/5864d6a4b62ada2ad60a8c456b4ee62972a9c10d
  • Add a GetForeignUpperPaths callback function for FDWs. This is basically like the just-added create_upper_paths_hook, but control is funneled only to the FDW responsible for all the baserels of the current query; so providing such a callback is much less likely to add useless overhead than using the hook function is. The documentation is a bit sketchy. We'll likely want to improve it, and/or adjust the call conventions, when we get some experience with actually using this callback. Hopefully somebody will find time to experiment with it before 9.6 feature freeze. http://git.postgresql.org/pg/commitdiff/101fd9349eddb7e9ed84a239145d5230a9bc7336
  • Cope if platform declares mbstowcs_l(), but not locale_t, in <xlocale.h>. Previously, we included <xlocale.h> only if necessary to get the definition of type locale_t. According to notes in PGAC_TYPE_LOCALE_T, this is important because on some versions of glibc that file supplies an incompatible declaration of locale_t. (This info may be obsolete, because on my RHEL6 box that seems to be the *only* definition of locale_t; but there may still be glibc's in the wild for which it's a live concern.) It turns out though that on FreeBSD and maybe other BSDen, you can get locale_t from stdlib.h or locale.h but mbstowcs_l() and friends only from <xlocale.h>. This was leaving us compiling calls to mbstowcs_l() and friends with no visible prototype, which causes a warning and could possibly cause actual trouble, since it's not declared to return int. Hence, adjust the configure checks so that we'll include <xlocale.h> either if it's necessary to get type locale_t or if it's necessary to get a declaration of mbstowcs_l(). Report and patch by Aleksander Alekseev, somewhat whacked around by me. Back-patch to all supported branches, since we have been using mbstowcs_l() since 9.1. http://git.postgresql.org/pg/commitdiff/0e9b89986b7ced6daffdf14638a25a35c45423ff
  • Be more careful about out-of-range dates and timestamps. Tighten the semantics of boundary-case timestamptz so that we allow timestamps >= '4714-11-24 00:00+00 BC' and < 'ENDYEAR-01-01 00:00+00 AD' exactly, no more and no less, but it is allowed to enter timestamps within that range using non-GMT timezone offsets (which could make the nominal date 4714-11-23 BC or ENDYEAR-01-01 AD). This eliminates dump/reload failure conditions for timestamps near the endpoints. To do this, separate checking of the inputs for date2j() from the final range check, and allow the Julian date code to handle a range slightly wider than the nominal range of the datatypes. Also add a bunch of checks to detect out-of-range dates and timestamps that formerly could be returned by operations such as date-plus-integer. All C-level functions that return date, timestamp, or timestamptz should now be proof against returning a value that doesn't pass IS_VALID_DATE() or IS_VALID_TIMESTAMP(). Vitaly Burovoy, reviewed by Anastasia Lubennikova, and substantially whacked around by me http://git.postgresql.org/pg/commitdiff/a70e13a39eccf5fc944c66e0029004b6abcb3cae
  • Fix j2day() to behave sanely for negative Julian dates. Somebody had apparently once figured that casting to unsigned int would produce the right output for negative inputs, but that would only be true if 2^32 were a multiple of 7, which of course it ain't. We need to use a signed division and then correct the sign of the remainder. AFAICT, the only case where this would arise currently is when doing ISO-week calculations for dates in 4714BC, where we'd compute a negative Julian date representing 4714-01-04BC and then do some arithmetic with it. Since we don't even really document support for such dates, this is not of much consequence. But we may as well get it right. Per report from Vitaly Burovoy. http://git.postgresql.org/pg/commitdiff/5db51464311eb7fe4e90030c6a514ff61e9f1c00
  • Fix "pg_bench -C -M prepared". This didn't work because when we dropped and re-established a database connection, we did not bother to reset session-specific state such as the statements-are-prepared flags. The st->prepared[] array certainly needs to be flushed, and I cleared a couple of other fields as well that couldn't possibly retain meaningful state for a new connection. In passing, fix some bogus comments and strange field order choices. Per report from Robins Tharakan. http://git.postgresql.org/pg/commitdiff/47211af17a2dbee38b53b2ea6de81499dbb2c7f5
  • Fix assorted breakage in to_char()'s OF format option. In HEAD, fix incorrect field width for hours part of OF when tm_gmtoff is negative. This was introduced by commit 2d87eedc1d4468d3 as a result of falsely applying a pattern that's correct when + signs are omitted, which is not the case for OF. In 9.4, fix missing abs() call that allowed a sign to be attached to the minutes part of OF. This was fixed in 9.5 by 9b43d73b3f9bef27, but for inscrutable reasons not back-patched. In all three versions, ensure that the sign of tm_gmtoff is correctly reported even when the GMT offset is less than 1 hour. Add regression tests, which evidently we desperately need here. Thomas Munro and Tom Lane, per report from David Fetter http://git.postgresql.org/pg/commitdiff/55c3a04d60ccea9e999088fb847ceeb9fd4dd927
  • Remove useless double calls of make_parsestate(). Aleksander Alekseev http://git.postgresql.org/pg/commitdiff/bd0ab28912d7502b237b8aeb95d052abe4ff6bc6
  • Clean up some misplaced #includes. Random .h files have no business including postgres-fe.h (or postgres.h). If that wasn't the first #include done by the calling .c file, it's the .c file that's broken. Noted while prepping Kyotaro Horiguchi's psql lexer refactoring patch. http://git.postgresql.org/pg/commitdiff/3422fecccadb021b7b4cdbc73b2c29f66f031761
  • Decouple psqlscan.l from surrounding program. Remove assorted external references from psqlscan.l in preparation for making it usable by other frontend programs. This mostly involves getting rid of direct calls to psql_error() and GetVariable() in favor of introducing a callback-functions struct to encapsulate variable fetching and error printing. In addition, pass the current encoding and standard-strings status as additional parameters to psql_scan_setup instead of looking directly at "pset" or calling additional functions. I did not bother to change some references to psql_error that are in functions that will soon migrate to a psql-specific backslash-command lexer. Other than that, this version of psqlscan.l is capable of compiling standalone. It still depends on assorted src/common functions as well as some encoding-related libpq functions, but we expect that all programs using it will be happy with those dependencies. Kyotaro Horiguchi, somewhat editorialized on by me http://git.postgresql.org/pg/commitdiff/4e1d2a170836028370675922ea9a690648d3c18d
  • Convert psql's flex lexer to be re-entrant, and make it compile standalone. Change psqlscan.l to specify '%option reentrant', adjust internal APIs to match, and get rid of its internal static variables. While this is good cleanup in an abstract sense, the reason to do it right now is that it seems the only practical way to support use of separate flex lexers with common PsqlScanState infrastructure. If we build two non-reentrant lexers then we are going to have problems with dangling buffer pointers in whichever lexer isn't active when we transition from one buffer to another, as well as curious side-effects if we try to share any code between the files. (Horiguchi-san had a different solution to that in his pending patch, but I find it ugly and probably broken for corner cases.) Depending on which version of flex you're using, this may result in getting a "warning: unused variable 'yyg'" warning from psqlscan, similar to the one you'd have seen for a long time in backend/parser/scan.l. I put a local -Wno-error into CFLAGS for the file, for the convenience of those who compile with -Werror. Also, stop compiling psqlscan as part of mainloop.c, and make it a standalone build target instead. This is a lot cleaner than before, though it doesn't really change much in practice as of this commit. (I'm not sure whether the MSVC build scripts will need some help with this part, but the buildfarm will soon tell us.) http://git.postgresql.org/pg/commitdiff/27199058d98ef7ff2f468af44654bc35bb70fe4a
  • Split psql's lexer into two separate .l files for SQL and backslash cases. This gets us to a point where psqlscan.l can be used by other frontend programs for the same purpose psql uses it for, ie to detect when it's collected a complete SQL command from input that is divided across line boundaries. Moreover, other programs can supply their own lexers for backslash commands of their own choosing. A follow-on patch will use this in pgbench. The end result here is roughly the same as in Kyotaro Horiguchi's 0001-Make-SQL-parser-part-of-psqlscan-independent-from-ps.patch, although the details of the method for switching between lexers are quite different. Basically, in this patch we share the entire PsqlScanState, YY_BUFFER_STATE stack, *and* yyscan_t between different lexers. The only thing we need to do to switch to a different lexer is to make sure the start_state is valid for the new lexer. This works because flex doesn't keep any other persistent state that depends on the specific lexing tables generated for a particular .l file. (We are assuming that both lexers are built with the same flex version, or at least versions that are compatible with respect to the contents of yyscan_t; but that doesn't seem likely to be a big problem in practice, considering how slowly flex changes.) Aside from being more efficient than Horiguchi-san's original solution, this avoids possible corner-case changes in semantics: the original code was capable of popping the input buffer stack while still staying in backslash-related parsing states. I'm not sure that that equates to any useful user-visible behaviors, but I'm not sure it doesn't either, so I'm loath to assume that we only need to consider the topmost buffer when parsing a backslash command. I've attempted to update the MSVC build scripts for the added .l file, but will rely on the buildfarm to see if I missed anything. Kyotaro Horiguchi and Tom Lane http://git.postgresql.org/pg/commitdiff/0ea9efbe9ec1bf07cc6ae070bdd54700af08e44d
  • Suppress FLEX_NO_BACKUP check for psqlscanslash.l. The existing infrastructure for FLEX_NO_BACKUP doesn't work reliably when two lexers are built in parallel in the same directory. We can probably fix that, but as a short-term workaround, just don't make the check for psqlscanslash.l. Per buildfarm. http://git.postgresql.org/pg/commitdiff/a3e39f83632935911bc159154a33e89495f4a676
  • Use yylex_init not yylex_init_extra(). Older versions of flex don't have the latter. Per buildfarm. http://git.postgresql.org/pg/commitdiff/ff0a7e6167f475672d82d1cd7cd0d5e735154c4d
  • Fix missed update in _readForeignScan(). Blatant fail in 0bf3ae88af330496517722e391e7c975e6bad219. Caught by buildfarm member mandrill. http://git.postgresql.org/pg/commitdiff/07aed46a6b3994508e5674301c85ebf5807905ea
  • With ancient gcc, skip pg_attribute_printf() on function pointer. Buildfarm results show that the ability to attach pg_attribute_printf decoration to a function pointer appeared somewhere between gcc 2.95.3 and gcc 4.0.1. Guess that it was there in 4.0. http://git.postgresql.org/pg/commitdiff/b46d9beb658af7eb4e2a08dfa34206a117c9654f
  • Build backend/parser/scan.l and interfaces/ecpg/preproc/pgc.l standalone. Now that we know about the %top{} trick, we can revert to building flex lexers as separate .o files. This is worth doing for a couple of reasons besides sheer cleanliness. We can narrow the scope of the -Wno-error flag that's forced on scan.c. Also, since these grammar and lexer files are so large, splitting them into separate build targets should have some advantages in build speed, particularly in parallel or ccache'd builds. We have quite a few other .l files that could be changed likewise, but the above arguments don't apply to them, so the benefit of fixing them seems pretty minimal. Leave the rest for some other day. http://git.postgresql.org/pg/commitdiff/72b1e3a21f0540ffa5c1f8f474b6c52097a368bb
  • Typo fix. http://git.postgresql.org/pg/commitdiff/78e7c4439917b01afd645a2ec657008ba6c33d37
  • Sync backend/parser/scan.l with bin/psql/psqlscan.l. Make some minor formatting adjustments to make it easier to diff these files and see that they indeed implement the same flex rules (at least to the extent that we want them to be the same). (Someday it'd be nice to make ecpg's pgc.l more easily diff'able too, but today is not that day.) Also run relevant parts of these files and psqlscanslash.l through pgindent. No actual behavioral changes here, just obsessive neatnik-ism. http://git.postgresql.org/pg/commitdiff/21c8ee79464a180ab0257abdfceae89274a46632
  • Fix phony .PHONY. A couple makefiles had misspelled the magic .PHONY target as PHONY. http://git.postgresql.org/pg/commitdiff/d5351fcb03fc8e20651d5863b88b397a8be68d74
  • SQL commands in pgbench scripts are now ended by semicolons, not newlines. To allow multiline SQL commands in scripts, adopt the same rules psql uses to decide what is the end of a SQL command, to wit, an unquoted semicolon not encased in parentheses. Do this by importing the same flex lexer that psql uses, since coping with stuff like dollar-quoted literals is hard to get right without going the full nine yards. This makes use of the infrastructure added in commit 0ea9efbe9ec1bf07 to support independently-written flex lexers scanning the same PsqlScanState input-buffer data structure. Since that infrastructure isn't very friendly to ad-hoc parsing code such as strtok(), improve exprscan.l so that it can parse either whitespace-separated words or expression tokens, on demand, and rewrite pgbench.c's backslash-command parsing code to always use the lexer to fetch tokens. It's still the case that pgbench backslash commands extend to the end of the line, no more and no less. That could be changed in a fairly localized way now, and there was some interest in doing so, but it seems like material for a separate patch. In passing, make some marginal cleanups in syntax error reporting, const-ify a few data structures that could use it, and run some of this code through pgindent. I can't tell whether the MSVC build scripts need to be taught explicitly about the changes here or not, but the buildfarm will soon tell us. Kyotaro Horiguchi and Tom Lane http://git.postgresql.org/pg/commitdiff/68ab8e8ba4a471d91b69f2f89782ba10a0fbef0c
  • Make pgbench's expression lexer reentrant. This is a necessary preliminary step for making it play with psqlscan.l given the way I set up the lexer input-buffer sharing mechanism in commit 0ea9efbe9ec1bf07. I've not tried to make it *actually* reentrant; there's still some static variables laying about. But flex thinks it's reentrant, and that's what counts. In support of that, fix exprparse.y to pass through the yyscan_t from the caller. Also do some minor code beautification, like not casting away const. http://git.postgresql.org/pg/commitdiff/429ee5a822db0e8faf669d77c810f1eeaaff1ab4
  • Best-guess attempt at fixing MSVC build for 68ab8e8ba4a471d9. pgbench now needs to use src/bin/psql/psqlscan.l, but it's not very clear how to fit that into the MSVC build system. If this doesn't work I'm going to need some help from somebody who actually understands those scripts ... http://git.postgresql.org/pg/commitdiff/6f1f34c92b11593ec62ff3e12781eb96dc911821
  • Use %option bison-bridge in psql/pgbench lexers. The point of this change is to use %pure-parser in pgbench's exprparse.y. The immediate reason is that it turns out very ancient versions of bison have a bug with the combination of a reentrant lexer and non-reentrant parser. We could consider dropping support for such ancient bisons; but considering that we might well need exprparse.y to be reentrant some day, it seems better to make it so right now than to move the portability goalposts. (AFAICT there's no particular performance consequence to this change, either, so there's no good reason not to do it.) Now, %pure-parser assumes that the called lexer is built with %option bison-bridge. Because we're assuming bitwise compatibility of yyscan_t (yyguts_t) data structures among all the psql/pgbench lexers, that requirement propagates back to psql's lexers as well. But it's just a few lines of change on that side too; and if psqlscan.l is to set the baseline for a possibly-large family of lexers, it should err on the side of including not omitting useful features. http://git.postgresql.org/pg/commitdiff/b6afae71aaf6d2df76d0a0a77c8b630220a01ec1
  • Clean up some Coverity complaints about commit 0bf3ae88af330496. The two get_tle_by_resno() calls introduced by this commit lacked any check for a NULL return, unlike any other calls of that function anywhere in our tree. Coverity quite properly complained about it. Also fix a misindented line in process_query_params(), which Coverity also complained about on the grounds that the bad indentation suggested possible programmer misinterpretation. http://git.postgresql.org/pg/commitdiff/92b7902deb3155f6975f33e8b6c8be4d9d066172
  • Allow the delay in psql's \watch command to be a fractional second. Instead of just "2" seconds, allow eg. "2.5" seconds. Per request from Alvaro Herrera. No docs change since the docs didn't say you couldn't do this already. http://git.postgresql.org/pg/commitdiff/b283096534b9c514a92a70c98c033015b6792ba7
  • Improve header output from psql's \watch command. Include the \pset title string if there is one, and shorten the prefab part of the header to be "timestamp (every Ns)". Per suggestion by David Johnston. Michael Paquier and Tom Lane http://git.postgresql.org/pg/commitdiff/dea2b5960a9460c02896ed361d35e92bce02801a
  • Fix EvalPlanQual bug when query contains both locked and not-locked rels. In commit afb9249d06f47d7a, we (probably I) made ExecLockRows assign null test tuples to all relations of the query while setting up to do an EvalPlanQual recheck for a newly-updated locked row. This was sheerest brain fade: we should only set test tuples for relations that are lockable by the LockRows node, and in particular empty test tuples are only sensible for inheritance child relations that weren't the source of the current tuple from their inheritance tree. Setting a null test tuple for an unrelated table causes it to return NULLs when it should not, as exhibited in bug #14034 from Bronislav Houdek. To add insult to injury, doing it the wrong way required two loops where one would suffice; so the corrected code is even a bit shorter and faster. Add a regression test case based on his example, and back-patch to 9.5 where the bug was introduced. http://git.postgresql.org/pg/commitdiff/71404af2a29ce4a3a5907cdc8b893ec2bc0285b4
  • Fix unsafe use of strtol() on a non-null-terminated Text datum. jsonb_set() could produce wrong answers or incorrect error reports, or in the worst case even crash, when trying to convert a path-array element into an integer for use as an array subscript. Per report from Vitaly Burovoy. Back-patch to 9.5 where the faulty code was introduced (in commit c6947010ceb42143). Michael Paquier http://git.postgresql.org/pg/commitdiff/384dfbde19330541f7fb487f9352949aa06c812e
  • Code review for error reports in jsonb_set(). User-facing (even tested by regression tests) error conditions were thrown with elog(), hence had wrong SQLSTATE and were untranslatable. And the error message texts weren't up to project style, either. http://git.postgresql.org/pg/commitdiff/ea4b8bd6188ecb17ba37d93f57b8b020a964e66c
  • Move keywords.c/kwlookup.c into src/common/. Now that we have src/common/ for code shared between frontend and backend, we can get rid of (most of) the klugy ways that the keyword table and keyword lookup code were formerly shared between different uses. This is a first step towards a more general plan of getting rid of special-purpose kluges for sharing code in src/bin/. I chose to merge kwlookup.c back into keywords.c, as it once was, and always has been so far as keywords.h is concerned. We could have kept them separate, but there is noplace that uses ScanKeywordLookup without also wanting access to the backend's keyword list, so there seems little point. ecpg is still a bit weird, but at least now the trickiness is documented. I think that the MSVC build script should require no adjustments beyond what's done here ... but we'll soon find out. http://git.postgresql.org/pg/commitdiff/2c6af4f44228d76d3351fe26f68b00b55cdd239a
  • Avoid PGDLLIMPORT for simple local references in frontend programs. I was wondering if this would be an issue, and buildfarm member frogmouth says it is. http://git.postgresql.org/pg/commitdiff/c2d1eea9e750edb267e3f071a129e03d79ad198b
  • Create src/fe_utils/, and move stuff into there from pg_dump's dumputils. Per discussion, we want to create a static library and put the stuff into it that until now has been shared across src/bin/ directories by ad-hoc methods like symlinking a source file. This commit creates the library and populates it with a couple of files that contain the widely-useful portions of pg_dump's dumputils.c file. dumputils.c survives, because it has some stuff that didn't seem appropriate for fe_utils, but it's significantly smaller and is no longer referenced from any other directory. Follow-on patches will move more stuff into fe_utils. The Mkvcbuild.pm hacking here is just a best guess; we'll see how the buildfarm likes it. http://git.postgresql.org/pg/commitdiff/588d963b00e5e4385b6425418e3faa726f63f72e
  • Add missed inclusion requirement in Mkvcbuild.pm. Per buildfarm. http://git.postgresql.org/pg/commitdiff/0ecd3fedfcf3427ebeb73cc61b2fcf6ed67c43a2
  • Suppress compiler warning for get_am_type_string(). Compilers that don't know that elog(ERROR) doesn't return complained that this function might fail to return a value. Per buildfarm. While at it, const-ify the function's declaration, since the intent is evidently to always return a constant string. http://git.postgresql.org/pg/commitdiff/a376960c8f8ec08783e1c529f36fbeb60236b378
  • Move psql's print.c and mbprint.c into src/fe_utils. Just turning the crank ... http://git.postgresql.org/pg/commitdiff/d65bea26a867e3bbd053bf87b985b0e113256414
  • Move psql's psqlscan.l into src/fe_utils. This completes (at least for now) the project of getting rid of ad-hoc linkages among the src/bin/ subdirectories. Everything they share is now in src/fe_utils/ and is included from a static library at link time. A side benefit is that we can restore the FLEX_NO_BACKUP check for psqlscanslash.l. We might need to think of another way to do that check if we ever need to build two lexers with that property in the same source directory, but there's no foreseeable reason to need that. http://git.postgresql.org/pg/commitdiff/c1156411ad0879a71956b64aa487babe7572685b
  • Link libpq after libpgfeutils to satisfy Windows linker. Some of the non-MSVC Windows buildfarm members seem to need this to avoid getting "undefined symbol" errors on libpgfeutils' references to libpq. I could understand that if libpq were a static library, but surely it is not? Oh well, at least the extra reference is no more harmful than it is for libpgcommon or libpgport. http://git.postgresql.org/pg/commitdiff/7caaeaf3607fae91318f24debce3dc017ca299a3
  • Don't split up SRFs when choosing to postpone SELECT output expressions. In commit 9118d03a8cca3d97 we taught the planner to postpone evaluation of set-returning functions in a SELECT's targetlist until after any sort done to satisfy ORDER BY. However, if we postpone some SRFs this way while others do not get postponed (because they're sort or group key columns) we will break the traditional behavior by which all SRFs in the tlist run in-step during ExecTargetList(), so that you get the least common multiple of their periods not the product. Fix make_sort_input_target() so it will not split up SRF evaluation in such cases. There is still a hazard of similar odd behavior if there's a SRF in a grouping column and another one that isn't, but that was true before and we're just trying to preserve bug-compatibility with the traditional behavior. This whole area is overdue to be rethought and reimplemented, but we'll try to avoid changing behavior until then. Per report from Regina Obe. http://git.postgresql.org/pg/commitdiff/d543170f2fdd6d9845aaf91dc0f6be7a2bf0d9e7
  • Fix DROP OPERATOR to reset oprcom/oprnegate links to the dropped operator. This avoids leaving dangling links in pg_operator; which while fairly harmless are also unsightly. While we're at it, simplify OperatorUpd, which went through heap_modify_tuple for no very good reason considering it had already made a tuple copy it could just scribble on. Roma Sokolov, reviewed by Tomas Vondra, additional hacking by Robert Haas and myself. http://git.postgresql.org/pg/commitdiff/c94959d4110a1965472956cfd631082a96f64a84
  • In PL/Tcl, make database errors return additional info in the errorCode. Tcl has a convention for returning additional info about an error in a global variable named errorCode. Up to now PL/Tcl has ignored that, but this patch causes database errors caught by PL/Tcl to fill in errorCode with useful information from the ErrorData struct. Jim Nasby, reviewed by Pavel Stehule and myself http://git.postgresql.org/pg/commitdiff/fb8d2a7f57d87102f0a95025fbf1cad9c341739b
  • Improve PL/Tcl errorCode facility by providing decoded name for SQLSTATE. We don't really want to encourage people to write numeric SQLSTATEs in programs; that's unreadable and error-prone. Copy plpgsql's infrastructure for converting between SQLSTATEs and exception names shown in Appendix A, and modify examples in tests and documentation to do it that way. http://git.postgresql.org/pg/commitdiff/cd37bb78599dcf24cd22a124ce9174b5e2a76880
  • Fix PL/Tcl for vpath builds. Commit cd37bb78599dcf24 works for in-tree builds, but not so much for VPATH. Per buildfarm. http://git.postgresql.org/pg/commitdiff/9f73a2f6d1c1305cf0dc749dbf631cffe26beda0
  • Update time zone data files to tzdata release 2016c. DST law changes in Azerbaijan, Chile, Haiti, Palestine, and Russia (Altai, Astrakhan, Kirov, Sakhalin, Ulyanovsk regions). Historical corrections for Lithuania, Moldova, Russia (Kaliningrad, Samara, Volgograd). As of 2015b, the keepers of the IANA timezone database started to use numeric time zone abbreviations (e.g., "+04") instead of inventing abbreviations not found in the wild like "ASTT". This causes our rather old copy of zic to whine "warning: time zone abbreviation differs from POSIX standard" several times during "make install". This warning is harmless according to the IANA folk, and I don't see any problems with these abbreviations in some simple tests; but it seems like now would be a good time to update our copy of the tzcode stuff. I'll look into that soon. http://git.postgresql.org/pg/commitdiff/676265eb7b57ba5bfae859630b909e6045893b68
  • Avoid a couple of zero-divide scenarios in the planner. cost_subplan() supposed that the given subplan must have plan_rows > 0, which as far as I can tell was true until recent refactoring of the code in createplan.c; but now that code allows the Result for a provably empty subquery to have plan_rows = 0. Rather than undo that change, put in a clamp to prevent zero divide. get_cheapest_fractional_path() likewise supposed that best_path->rows > 0. This assumption has been wrong for longer. It's actually harmless given IEEE float math, because a positive value divided by zero gives +Infinity and compare_fractional_path_costs() will do the right thing with that. Still, best not to assume that. final_cost_nestloop() also seems to have some risks in this area, so borrow the clamping logic already present in the mergejoin cost functions. Lastly, remove unnecessary clamp_row_est() in planner.c's calls to get_number_of_groups(). The only thing that function does with path_rows is pass it to estimate_num_groups() which already has an internal clamp, so we don't need the extra call; and if we did, the callers are arguably the wrong place for it anyway. First two items reported by Piotr Stefaniak, the others are products of my nosing around for similar problems. No back-patch since there's no evidence that problems arise in the back branches. http://git.postgresql.org/pg/commitdiff/76281aa9647e6a5dfc646514554d0f519e3b8a58
  • Modernize zic's test for valid timezone abbreviations. We really need to sync all of our IANA-derived timezone code with upstream, but that's going to be a large patch and I certainly don't care to shove such a thing into stable branches immediately before a release. As a stopgap, copy just the tzcode2016c logic that checks validity of timezone abbreviations. This prevents getting multiple "time zone abbreviation differs from POSIX standard" bleats with tzdata 2014b and later. http://git.postgresql.org/pg/commitdiff/221619ad69b7e060041796a1974fbb0eeb9542d7
  • First-draft release notes for 9.5.2. As usual, the release notes for other branches will be made by cutting these down, but put them up for community review first. http://git.postgresql.org/pg/commitdiff/29b6123ecb4113e366325245cec5a5c221dae691
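
To illustrate the pgbench scripting change above: SQL commands in custom scripts now end at an unquoted semicolon, as in psql, so a command may span several lines. A minimal sketch using the standard pgbench tables (:aid is assumed to have been set earlier in the script):

    UPDATE pgbench_accounts
       SET abalance = abalance + 1
     WHERE aid = :aid;
    SELECT abalance
      FROM pgbench_accounts
     WHERE aid = :aid;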
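And as a usage note for the \watch changes above, the delay may now be fractional and the \pset title, if any, is shown in the header; for example, re-running the current query buffer every 2.5 seconds:

    SELECT now() AS ts, count(*) AS sessions FROM pg_stat_activity
    \watch 2.5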

Robert Haas pushed:

Peter Eisentraut pushed:

Álvaro Herrera pushed:

Stephen Frost pushed:

  • Avoid incorrectly indicating exclusion constraint wait. INSERT ... ON CONFLICT's precheck may have to wait on the outcome of another insertion, which may or may not itself be a speculative insertion. This wait is not necessarily associated with an exclusion constraint, but was always reported that way in log messages if the wait happened to involve a tuple that had no speculative token. Initially discovered through use of ON CONFLICT DO NOTHING, where spurious references to exclusion constraints in log messages were more likely. Patch by Peter Geoghegan. Reviewed by Julien Rouhaud. Back-patch to 9.5 where INSERT ... ON CONFLICT was added. http://git.postgresql.org/pg/commitdiff/fd658dbb300456b393536802d1145a9cea7b25d6

Teodor Sigaev pushed:

Andres Freund pushed:

  • Error out if waiting on socket readiness without a specified socket. Previously we just ignored such an attempt, but that seems to serve no purpose but making things harder to debug. Discussion: 20160114143931.GG10941@awork2.anarazel.de 20151230173734.hx7jj2fnwyljfqek@alap3.anarazel.de Reviewed-By: Robert Haas http://git.postgresql.org/pg/commitdiff/6bc4d95fcc2a432fc202cba03d5393be096f0422
  • Only clear latch self-pipe/event if there is a pending notification. This avoids a good number of, individually quite fast, system calls in scenarios with many quick queries. Besides the aesthetic benefit of seing fewer superflous system calls with strace, it also improves performance by ~2% measured by pgbench -M prepared -c 96 -j 8 -S (scale 100). Without having benchmarked it, this patch also adjust the windows code, as that makes it easier to unify the unix/windows codepaths in a later patch. There's little reason to diverge in behaviour between the platforms. Discussion: CA+TgmoYc1Zm+Szoc_Qbzi92z2c1vRHZmjhfPn5uC=w8bXv6Avg@mail.gmail.com Reviewed-By: Robert Haas http://git.postgresql.org/pg/commitdiff/c4901a1e03a7730e4471fd1143f1caf79695493d
  • Remove unused, and dangerous, TestLatch() macro. The macro has not seen any in-tree use since latches had been introduced in 2746e5f, in 2010. http://git.postgresql.org/pg/commitdiff/fad0f9d8c9f6a8e99156b8f01cba54be39f31761
  • Make it easier to choose the used waiting primitive in unix_latch.c. This allows for easier testing of the different primitives; in preparation for adding a new primitive. Discussion: 20160114143931.GG10941@awork2.anarazel.de Reviewed-By: Robert Haas http://git.postgresql.org/pg/commitdiff/c17966201c7de2a4c437bed6d83c6a7f2e7108f4
  • Fix stupid omission in c4901a1e. Reported-By: Jeff Janes Discussion: CAMkU=1zGxREwoyaCrp_CHadEB+dPgpVyKBysCJ+6xP9gCOvAuw@mail.gmail.com http://git.postgresql.org/pg/commitdiff/6eb2be15b5d24b98d334a9dd637f0edb37e2eb7e
  • Second attempt at fixing MSVC build for 68ab8e8ba4a471d9. After the previous fix in 6f1f34c9 msvc ended up looking for psqlscan.c in the wrong directory. David's fix just forces the path to be adjusted. That's not a particularly pretty fix, but it hopefully will make the buildfarm green again. Author: David Rowley Discussion: CAKJS1f_9CCi_t+LEgV5GWoCj3wjavcMoDc5qfcf_A0UwpQoPoA@mail.gmail.com http://git.postgresql.org/pg/commitdiff/326d73c86fda407a810675c3b5a48e0a0cc992f5
  • Introduce WaitEventSet API. Commit ac1d794 ("Make idle backends exit if the postmaster dies.") introduced a regression on, at least, large linux systems. Constantly adding the same postmaster_alive_fds to the OSs internal datastructures for implementing poll/select can cause significant contention; leading to a performance regression of nearly 3x in one example. This can be avoided by using e.g. linux' epoll, which avoids having to add/remove file descriptors to the wait datastructures at a high rate. Unfortunately the current latch interface makes it hard to allocate any persistent per-backend resources. Replace, with a backward compatibility layer, WaitLatchOrSocket with a new WaitEventSet API. Users can allocate such a Set across multiple calls, and add more than one file-descriptor to wait on. The latter has been added because there's upcoming postgres features where that will be helpful. In addition to the previously existing poll(2), select(2), WaitForMultipleObjects() implementations also provide an epoll_wait(2) based implementation to address the aforementioned performance problem. Epoll is only available on linux, but that is the most likely OS for machines large enough (four sockets) to reproduce the problem. To actually address the aforementioned regression, create and use a long-lived WaitEventSet for FE/BE communication. There are additional places that would benefit from a long-lived set, but that's a task for another day. Thanks to Amit Kapila, who helped make the windows code I blindly wrote actually work. Reported-By: Dmitry Vasilyev Discussion: CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com 20160114143931.GG10941@awork2.anarazel.de http://git.postgresql.org/pg/commitdiff/98a64d0bd713cb89e61bef6432befc4b7b5da59e
  • Combine win32 and unix latch implementations. Previously latches for windows and unix had been implemented in different files. A later patch introduce an expanded wait infrastructure, keeping the implementation separate would introduce too much duplication. This basically just moves the functions, without too much change. The reason to keep this separate is that it allows blame to continue working a little less badly; and to make review a tiny bit easier. Discussion: 20160114143931.GG10941@awork2.anarazel.de http://git.postgresql.org/pg/commitdiff/72e2d21c1249b674496f97cd6009c0bda62f6b4d
  • Properly declare FeBeWaitSet. Surprising that this worked on a number of systems. Reported by buildfarm member longfin. http://git.postgresql.org/pg/commitdiff/7fa0064092e135415a558dc3c4d7393d14ab6d8e
  • Change various Gin*Is* macros to return 0/1. Returning the direct result of bit arithmetic, in a macro intended to be used in a boolean manner, can be problematic if the return value is stored in a variable of type 'bool'. If bool is implemented using C99's _Bool, that can lead to comparison failures if the variable is then compared again with the expression (see ginStepRight() for an example that fails), as _Bool forces the result to be 0/1. That happens in some configurations of newer MSVC compilers. It's also problematic when storing the result of such an expression in a narrower type. Several gin macros have been declared in that style since gin's initial commit in 8a3631f8d86. There's a lot more macros like this, but this is the only one causing regression test failures; and I don't want to commit and backpatch a larger patch with lots of conflicts just before the next set of minor releases. Discussion: 20150811154237.GD17575@awork2.anarazel.de Backpatch: All supported branches http://git.postgresql.org/pg/commitdiff/af4472bcb88ab36b9abbe7fd5858e570a65a2d1a
  • Don't use !! but != 0/NULL to force boolean evaluation. I introduced several uses of !! to force bit arithmetic to be boolean, but per discussion the project prefers != 0/NULL. Discussion: CA+TgmoZP5KakLGP6B4vUjgMBUW0woq_dJYi0paOz-My0Hwt_vQ@mail.gmail.com http://git.postgresql.org/pg/commitdiff/1a7a43672bf2939dda93a27d498349e7a4aa3c14

Andrew Dunstan pushed:

  • Remove dependency on psed for MSVC builds. Modern Perl has removed psed from its core distribution, so it might not be readily available on some build platforms. We therefore replace its use with a Perl script generated by s2p, which is equivalent to the sed script. The latter is retained for non-MSVC builds to avoid creating a new hard dependency on Perl for non-Windows tarball builds. Backpatch to all live branches. Michael Paquier and me. http://git.postgresql.org/pg/commitdiff/5d0320105699c253fe19b8b42ae1bffb67785b02

Fujii Masao pushed:

Simon Riggs pushed:

Álvaro Herrera pushed:

Correctifs rejetés (à ce jour)

No one was disappointed this week :-)

Correctifs en attente

Etsuro Fujita sent in a patch to expose umid in foreign tables.

Konstantin Knizhnik sent in another revision of a patch to extend ALTER INDEX to allow changing the predicate of a partial index.

Stephen Frost sent in two more revisions of a patch to add default roles.

Tomas Vondra sent in another revision of a patch to use foreign keys to improve join estimates.

Peter Geoghegan sent in a patch to distrust external OpenSSL clients and clear err queue.

Amit Kapila sent in another revision of a patch to prevent prepared statements from inappropriately enabling parallelism.

Álvaro Herrera sent in another revision of a patch to allow logical slots to follow timeline switches.

Amit Kapila sent in another revision of a patch to group the status update for transactions.

Kaigai Kouhei sent in another revision of a patch to rework custom scan serialization.

Anastasia Lubennikova sent in another revision of a patch to implement covering unique indexes.

Michaël Paquier sent in two more revisions of a patch to implement SCRAM authentication with infrastructure for making auth more pluggable.

Corey Huinker sent in another revision of a patch to implement \gexec in psql.
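
As proposed, \gexec sends the current query buffer and then executes each resulting value as a SQL statement; a minimal sketch (the behaviour may still change before commit):

    -- build one VACUUM ANALYZE command per table in the public schema, then run them all
    SELECT format('VACUUM ANALYZE %I.%I', schemaname, tablename)
      FROM pg_tables
     WHERE schemaname = 'public'
    \gexec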

Robbie Harwood sent in two more revisions of a patch to add GSSAPI encryption support.

David Rowley and Julien Rouhaud traded patches to enable setting parallel degree.

Alexander Shulgin sent in another revision of a patch to add \errverbose psql command and support in libpq.

Michaël Paquier sent in another revision of a patch to fix hot standby checkpoints.

Ashutosh Bapat sent in two revisions of a patch to require user mappings for foreign joins if the tables involved would have required same.

Oskari Saarenmaa sent in a patch to display backends for dropped roles in pg_stat_activity.

Craig Ringer sent in two more revisions of a patch to implement failover slots.

Michaël Paquier sent some follow-on patches to the ones now committed that make sure filesystem moves are successful before moving on.

Constantin S. Pan sent in another revision of a patch to speed up GIN build with parallel workers.

Thomas Reiss and Thom Brown traded patches to replace pg_stat_activity.waiting with something more descriptive.

Vik Fearing sent in another revision of a patch to add an idle-in-transaction timeout.
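
The GUC proposed by the patch, idle_in_transaction_session_timeout (the name is taken from the discussion and could change), would give up on sessions that sit idle inside a transaction for too long; a sketch:

    -- terminate this session if it stays "idle in transaction" for more than 5 minutes
    SET idle_in_transaction_session_timeout = '5min';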

David Fetter sent in another revision of a patch to add weighted statistics (mean, stddev).

David Rowley sent eight more revisions of a patch to enable parallel aggregation.

Peter Geoghegan sent in a patch to refactor speculative insertion into unique indexes.

Craig Ringer sent in two more revisions of a patch to fix an issue where logical decoding slots could be moved backwards from SQL even though that was never documented as supported.

Craig Ringer sent in a patch to fix TAP to initialize $timed_out correctly if passed.

Aleksander Alekseev sent in a patch to fix double variable initializations in policy.c.

Amit Kapila and Robert Haas traded patches to push down target lists below gather nodes.

Fabien COELHO and Álvaro Herrera traded patches to add pgbench stats per script, etc.

Andres Freund and Thomas Munro traded patches to fix a performance regression.

Artur Zakirov sent in a patch to allow numbers in email addresses in tsearch.

Dmitry Dolgov sent in two more revisions of a patch to add a function which allows inserting into a JSONB object at an arbitrary point.

Dmitry Ivanov sent in two more revisions of a patch to add phrase search to the tsearch extension.

David Rowley and Haribabu Kommi traded patches for combining aggregates.

Tomas Vondra sent in four more revisions of a patch to add multivariate statistics.

Dilip Kumar and Robert Haas traded patches to help scale relation extension.

Alexander Kuleshov sent in a patch to reduce three macros PG_CMD_PRINTF{1,2,3} to one.

Mark Dilger and Kevin Grittner traded patches to remove gratuitous gender-specific wording from the code.

Marisa Emerson sent in two more revisions of a patch to implement BSD Authentication.

Haribabu Kommi sent in three more revisions of a patch to add pg_hba_lookup().

Stas Kelvich sent in four revisions of a patch to fix a bug in fd.c on OS X.

Tomas Vondra sent in another revision of a patch to improve GROUP BY estimation.

Kevin Grittner sent in another revision of a patch to implement "snapshot too old" configured by time.

Teodor Sigaev sent in two more revisions of a patch to implement index lookups for OR clauses.

Andrew Dunstan sent in two revisions of a patch to add enum support to the btree_gin and btree_gist extensions.

Tomas Vondra sent in another revision of a patch to fix an issue in pgstat.

Robert Haas sent in a patch to fix an issue with the recently added memory management improvement for external sorts.

Tomas Vondra and Fabien COELHO traded patches to fix outdated docs on pgbench.

Petr Jelínek and Artur Zakirov traded patches to create generic logical WAL messages.

Anastasia Lubennikova sent in another revision of a patch to compress away duplicates in a B-Tree index.

Teodor Sigaev and Emre Hasegeli traded patches to add support for box type in SP-GiST index.

Christoph Berg sent in a patch to create a pg_filedump for 9.5.

Jeff Janes sent in a patch to allow zero weights in pgbench tests.

Magnus Hagander sent in another revision of a patch to implement non-exclusive backups.

Fabien COELHO sent in four more revisions of a patch to extend pgbench expressions with functions.

Alexander Korotkov sent in three more revisions of a patch to move PinBuffer and UnpinBuffer to atomics.

Tomas Vondra sent in another revision of a patch to enable index-only scans with partial indexes.

Yuri Niyazov sent in a patch to improve the pg_upgrade documentation.

Pavel Stěhule sent in another revision of a patch to add a \crosstabview to psql.
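
The proposed \crosstabview pivots an ordinary result set on the client side; the exact syntax was still being discussed, so the following is only a sketch (invoices, paid and amount are made-up names):

    SELECT extract(year FROM paid) AS year,
           to_char(paid, 'Mon')    AS month,
           sum(amount)             AS total
      FROM invoices
     GROUP BY 1, 2
    \crosstabview year month total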

Michaël Paquier and Ashutosh Bapat traded patches to fix some issues in the recently added FDW write pushdown patch.

Christian Ullrich sent in two revisions of a patch to fix realm handling in SSPI auth.

Anastasia Lubennikova sent in another revision of a patch to add covering + unique indexes.

Julien Rouhaud sent in two more revisions of a patch to choose parallel degree.

Jeff Janes sent in a patch to fix a typo in the vacuum progress documentation.

Aleksander Alekseev sent in two more revisions of a patch to make PostgreSQL sanitizers-friendly.

Abhijit Menon-Sen sent in a patch to implement ALTER FUNCTION x DEPENDS ON EXTENSION y.
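
Going by the patch title, the intended usage is roughly the following (function and extension names are hypothetical); the point is that dropping the extension would then drop the dependent function as well:

    ALTER FUNCTION my_helper(integer) DEPENDS ON EXTENSION my_extension;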

Artur Zakirov sent in another revision of a patch to add fuzzy substring searching with the pg_trgm extension.

Grzegorz Sampolski sent in another revision of a patch to add an rhost entry to PAM auth.

Constantin S. Pan sent in another revision of a patch to speed up GIN index builds with parallel workers.

Craig Ringer sent in another revision of a patch to add timeline following for logical decoding.

Michaël Paquier sent in two revisions of a patch to ensure that all messages from pg_rewind --debug get translated.

Pavan Deolasee sent in two revisions of a patch to fix an issue where pg_xlogdump fails to handle a WAL file with multi-page XLP_FIRST_IS_CONTRECORD data.

Craig Ringer sent in two revisions of a patch to add a README for src/backend/replication/logical.

Vitaly Burovoy and Michaël Paquier traded patches to fix an issue with path searching in jsonb_set when walking through a JSONB array.

Kyotaro HORIGUCHI sent in another revision of a patch to show dropped users' backends in pg_stat_activity.

Fujii Masao and SAWADA Masahiko traded patches for support for N synchronous standby servers, N > 1.

YUriy Zhuravlev sent in four revisions of a patch to add CINE to PREPARE.

Roma Sokolov and Robert Haas traded patches to fix DROP OPERATOR to reset links to itself on commutator and negator.

Amit Kapila sent in another revision of a patch to speed up CLOG access by increasing CLOG buffers.

Stephen Frost sent in another revision of a patch to cause pg_dump to dump catalog ACLs.

Aleksander Alekseev sent in another revision of a patch to fix lock contention for HASHHDR.mutex.

Daniel Verité sent in another revision of a patch to fix pg_dump and COPY's handling of gigantic lines.

Petr Jediný sent in another revision of a patch to add BRIN to the multicolumn indexes documentation.

Gilles Darold sent in another revision of a patch to add pg_current_logfile().

Kaigai Kouhei sent in another revision of a patch to rework CustomScan serialization/deserialization.

Tom Lane sent in a patch to fix SRF behavior in target lists.

Michaël Paquier sent in another revision of a patch to add VS 2015 support in src/tools/msvc.

Abhijit Menon-Sen sent in three more revisions of a patch to deal with extension dependencies that aren't quite 'e'.

Thomas Munro sent in three more revisions of a patch to add "causal reads."

Petr Jelínek sent in another revision of a patch to add a sequence access method and a gapless sequence implementation using same.

Aleksander Alekseev sent in two revisions of a patch to fix some code duplication in heapam.c.

Dagfinn Ilmari Mannsåker sent in three revisions of a patch to implement ALTER TYPE ... ALTER VALUE .. TO .. for enums.

Robert Haas sent in a patch to measure waste memory for cache alignment correctness.

Thomas Munro sent in a patch to fix a typo in a comment.

by N Bougain on Monday 28 March 2016 at 20:08

Tuesday 15 March 2016

PostgreSQL.fr News

PostgreSQL Weekly News - 13 March 2016

The first PostgreSQL Israel meetup will take place in Tel Aviv on 7 April 2016: http://www.meetup.com/PostgreSQL-Israel/events/229430729/

The PG Day UK 2016 call for papers runs until 5 April 2016. The conference itself takes place on 5 July 2016. Please submit your proposals to Simon.Riggs@2ndquadrant.com <Simon AT 2ndquadrant DOT com> or via the website: http://www.pgconf.uk/papers

The talks for PGCon in Ottawa have been selected: http://www.pgcon.org/2016/ugly-list-of-great-talks.txt

[Translator's note: meetup in Nantes on Thursday 17 March: http://www.meetup.com/fr-FR/PostgreSQL-User-Group-Nantes/]

PostgreSQL-derived product news

PostgreSQL job offers in March

PostgreSQL Local

  • The first pan-Asian PostgreSQL conference will be held from 17 to 19 March 2016 in Singapore. Registration is open: http://2016.pgday.asia/
  • Nordic PGDay, a single-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • Registration for PGDay Paris 2016, scheduled for 31 March, is open: http://www.pgday.paris/registration/
  • The 8th Session PostgreSQL will take place on 6 April 2016 in Lyon (France).
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York City. Registration is open: http://www.pgconf.us/2016/
  • LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for papers is now open: http://www.linuxfestnorthwest.org/2016/present
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. Speaker submissions are still being accepted: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will be held in Ottawa from 17 to 21 May 2016: http://www.pgcon.org/
  • This year's Swiss PGDay will be held at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for papers is open: http://www.pgday.ch/
  • "5432 ... Meet us!" will take place in Milan (Italy) on 28 & 29 June 2016. The call for papers runs until 14 March: http://5432meet.us/
  • PgConf Silicon Valley 2016 will take place from 14 to 16 November 2016: http://www.pgconfsv.com/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translated by the PostgreSQLFr team under the CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160314064234.GA25336@fetter.org

Submit your articles or announcements by Sunday at 15:00 Pacific time. Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org, and in Spanish to pwn (a) arpug.com.ar.

Applied Patches

Tom Lane pushed:

  • Mop-up for setting minimum Tcl version to 8.4. Commit e2609323e set the minimum Tcl version we support to 8.4, but I forgot to adjust the documentation to say the same. Some nosing around for other consequences found that the configure script could be simplified slightly as well. http://git.postgresql.org/pg/commitdiff/9da70efcbe09954b1006f878d39885a4acf1c534
  • Fix unportable usage of <ctype.h> functions. isdigit(), isspace(), etc are likely to give surprising results if passed a signed char. We should always cast the argument to unsigned char to avoid that. Error in commit d78a7d9c7fa3e9cd, found by buildfarm member gaur. http://git.postgresql.org/pg/commitdiff/cb0ca0c9953aa0614e7b143bd2440a7582a27233
  • Fix not-terribly-safe coding in NIImportOOAffixes() and NIImportAffixes(). There were two places in spell.c that supposed that they could search for a location in a string produced by lowerstr() and then transpose the offset into the original string. But this fails completely if lowerstr() transforms any characters into characters of different byte length, as can happen in Turkish UTF8 for instance. We'd added some comments about this coding in commit 51e78ab4ff328296, but failed to realize that it was not merely confusing but wrong. Coverity complained about this code years ago, but in such an opaque fashion that nobody understood what it was on about. I'm not entirely sure that this issue *is* what it's on about, actually, but perhaps this patch will shut it up -- and in any case the problem is clear. Back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/b3e05097e58051a7816ed734074fd76345687e0c
  • Fix broken definition for function name in pgbench's exprscan.l. As written, this would accept e.g. 123e9 as a function name. Aside from being mildly astonishing, that would come back to haunt us if we ever try to add float constants to the expression syntax. Insist that function names start with letters (or at least non-digits). In passing reset yyline as well as yycol when starting a new expression. This variable is useless since it's used nowhere, but if we're going to have it we should have it act sanely. http://git.postgresql.org/pg/commitdiff/3899caf772c8dec5c79e553c91f8fc248ca686c9
  • Re-fix broken definition for function name in pgbench's exprscan.l. Wups, my first try wasn't quite right either. Too focused on fixing the existing bug, not enough on not introducing new ones. http://git.postgresql.org/pg/commitdiff/94f1adccd36df3ad75d2c257c9ae1ca448f3e4ac
  • Fix backwards test for Windows service-ness in pg_ctl. A thinko in a96761391 caused pg_ctl to get it exactly backwards when deciding whether to report problems to the Windows eventlog or to stderr. Per bug #14001 from Manuel Mathar, who also identified the fix. Like the previous patch, back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/b642e50aea1b966f3b78c49e806b4a2c5497a861
  • Make the upper part of the planner work by generating and comparing Paths. I've been saying we needed to do this for more than five years, and here it finally is. This patch removes the ever-growing tangle of spaghetti logic that grouping_planner() used to use to try to identify the best plan for post-scan/join query steps. Now, there is (nearly) independent consideration of each execution step, and entirely separate construction of Paths to represent each of the possible ways to do that step. We choose the best Path or set of Paths using the same add_path() logic that's been used inside query_planner() for years. In addition, this patch removes the old restriction that subquery_planner() could return only a single Plan. It now returns a RelOptInfo containing a set of Paths, just as query_planner() does, and the parent query level can use each of those Paths as the basis of a SubqueryScanPath at its level. This allows finding some optimizations that we missed before, wherein a subquery was capable of returning presorted data and thereby avoiding a sort in the parent level, making the overall cost cheaper even though delivering sorted output was not the cheapest plan for the subquery in isolation. (A couple of regression test outputs change in consequence of that. However, there is very little change in visible planner behavior overall, because the point of this patch is not to get immediate planning benefits but to create the infrastructure for future improvements.) There is a great deal left to do here. This patch unblocks a lot of planner work that was basically impractical in the old code structure, such as allowing FDWs to implement remote aggregation, or rewriting plan_set_operations() to allow consideration of multiple implementation orders for set operations. (The latter will likely require a full rewrite of plan_set_operations(); what I've done here is only to fix it to return Paths not Plans.) I have also left unfinished some localized refactoring in createplan.c and planner.c, because it was not necessary to get this patch to a working state. Thanks to Robert Haas, David Rowley, and Amit Kapila for review. http://git.postgresql.org/pg/commitdiff/3fc6e2d7f5b652b417fa6937c34de2438d60fa9f
  • Spell "parallel" correctly. Per David Rowley. http://git.postgresql.org/pg/commitdiff/cf8e7b16a5f3e63fe692d042fefc0c9f09a23ebc
  • Fix minor typo in logical-decoding docs. David Rowley http://git.postgresql.org/pg/commitdiff/a93aec4e0f061ad43034d5324b8407a824e54395
  • Finish refactoring make_foo() functions in createplan.c. This patch removes some redundant cost calculations that I left for later cleanup in commit 3fc6e2d7f5b652b4. There's now a uniform policy that the make_foo() convenience functions don't do any cost calculations. Most of their callers copy costs from the source Path node, and for those that don't, the calculation in the make_foo() function wasn't necessarily right anyhow. (make_result() was particularly a mess, as it was serving multiple callers using cost calcs designed for only the first one or two that had ever existed.) Aside from saving a few cycles, this ensures that what EXPLAIN prints matches the costs we used for planning purposes. It does not change any planner decisions, since the decisions are already made. http://git.postgresql.org/pg/commitdiff/8c314b9853c2fbb85c041d4761426f25a9d63972
  • Fix minor thinko in pathification code. I passed the wrong "root" struct to create_pathtarget in build_minmax_path. Since the subroot is a clone of the outer root, this would not cause any serious problems, but it would waste some cycles because set_pathtarget_cost_width would not have access to Var width estimates set up while running query_planner on the subroot. http://git.postgresql.org/pg/commitdiff/61fd218930db53079e5f001dd4ea2fd53afd1b95
  • Improve handling of group-column indexes in GroupingSetsPath. Instead of having planner.c compute a groupColIdx array and store it in GroupingSetsPaths, make create_groupingsets_plan() find the grouping columns by searching in the child plan node's tlist. Although that's probably a bit slower for create_groupingsets_plan(), it's more like the way every other plan node type does this, and it provides positive confirmation that we know which child output columns we're supposed to be grouping on. (Indeed, looking at this now, I'm not at all sure that it wasn't broken before, because create_groupingsets_plan() isn't demanding an exact tlist match from its child node.) Also, this allows substantial simplification in planner.c, because it no longer needs to compute the groupColIdx array at all; no other cases were using it. I'd intended to put off this refactoring until later (like 9.7), but in view of the likely bug fix and the need to rationalize planner.c's tlist handling so we can do something sane with Konstantin Knizhnik's function-evaluation-postponement patch, I think it can't wait. http://git.postgresql.org/pg/commitdiff/9e8b99420fe5f80495ada8dc50aeb7b954b33093
  • Improve handling of pathtargets in planner.c. Refactor so that the internal APIs in planner.c deal in PathTargets not targetlists, and establish a more regular structure for deriving the targets needed for successive steps. There is more that could be done here; calculating the eval costs of each successive target independently is both inefficient and wrong in detail, since we won't actually recompute values available from the input node's tlist. But it's no worse than what happened before the pathification rewrite. In any case this seems like a good starting point for considering how to handle Konstantin Knizhnik's function-evaluation-postponement patch. http://git.postgresql.org/pg/commitdiff/51c0f63e4d76a86b44e87876a6addcfffb01ec28
  • Fix copy-and-pasteo in comment. Wensheng Zhang http://git.postgresql.org/pg/commitdiff/d31f20e2b5a246f276c73134b610ac7a2f34e274
  • Fix incorrect tlist generation in create_gather_plan(). This function is written as though Gather doesn't project; but it does. Even if it did not project, though, we must use build_path_tlist to ensure that the output columns receive correct sortgroupref labeling. Per report from Amit Kapila. http://git.postgresql.org/pg/commitdiff/8776c15c85322612b9bf79daf50f74be71c12e05
  • Fix incorrect handling of NULL index entries in indexed ROW() comparisons. An index search using a row comparison such as ROW(a, b) > ROW('x', 'y') would stop upon reaching a NULL entry in the "b" column, ignoring the fact that there might be non-NULL "b" values associated with later values of "a". This happens because _bt_mark_scankey_required() marks the subsidiary scankey for "b" as required, which is just wrong: it's for a column after the one with the first inequality key (namely "a"), and thus can't be considered a required match. This bit of brain fade dates back to the very beginnings of our support for indexed ROW() comparisons, in 2006. Kind of astonishing that no one came across it before Glen Takahashi, in bug #14010. Back-patch to all supported versions. Note: the given test case doesn't actually fail in unpatched 9.1, evidently because the fix for bug #6278 (i.e., stopping at nulls in either scan direction) is required to make it fail. I'm sure I could devise a case that fails in 9.1 as well, perhaps with something involving making a cursor back up; but it doesn't seem worth the trouble. http://git.postgresql.org/pg/commitdiff/a298a1e06fb0574c898a07761f9f86c2a323919e
  • Remove a couple of useless pstrdup() calls. There's no point in pstrdup'ing the result of TextDatumGetCString, since that's necessarily already a freshly-palloc'd C string. These particular calls are unlikely to be of any consequence performance-wise, but still they're a bad precedent that can confuse future patch authors. Noted by Chapman Flack. http://git.postgresql.org/pg/commitdiff/cc402116ca156babcd3ef941317f462a96277e3a
  • Refactor pull_var_clause's API to make it less tedious to extend. In commit 1d97c19a0f748e94 and later c1d9579dd8bf3c92, we extended pull_var_clause's API by adding enum-type arguments. That's sort of a pain to maintain, though, because it means every time we add a new behavior we must touch every last one of the call sites, even if there's a reasonable default behavior that most of them could use. Let's switch over to using a bitmask of flags, instead; that seems more maintainable and might save a nanosecond or two as well. This commit changes no behavior in itself, though I'm going to follow it up with one that does add a new behavior. In passing, remove flatten_tlist(), which has not been used since 9.1 and would otherwise need the same API changes. Removing these enums means that optimizer/tlist.h no longer needs to depend on optimizer/var.h. Changing that caused a number of C files to need addition of #include "optimizer/var.h" (probably we can thank old runs of pgrminclude for that); but on balance it seems like a good change anyway. http://git.postgresql.org/pg/commitdiff/364a9f47ab363250f62dd2c381c4da435283725a
  • Give pull_var_clause() reject/recurse/return behavior for WindowFuncs too. All along, this function should have treated WindowFuncs in a manner similar to Aggrefs, ie with an option whether or not to recurse into them. By not considering the case, it was always recursing, which is OK for most callers (although I suspect that the case in prepare_sort_from_pathkeys might represent a bug). But now we need return-without-recursing behavior as well. There are also more than a few callers that should never see a WindowFunc, and now we'll get some error checking on that. http://git.postgresql.org/pg/commitdiff/c82c92b111b7b636e80f8a432de10c62011b35b6
  • Minor additional refactoring of planner.c's PathTarget handling. Teach make_group_input_target() and make_window_input_target() to work entirely with the PathTarget representation of tlists, rather than constructing a tlist and immediately deconstructing it into PathTarget format. In itself this only saves a few palloc's; the bigger picture is that it opens the door for sharing cost_qual_eval work across all of planner.c's constructions of PathTargets. I'll come back to that later. In support of this, flesh out tlist.c's infrastructure for PathTargets a bit more. http://git.postgresql.org/pg/commitdiff/49635d7b3e86c0088eadd80db1563a210bc89efd
  • When appropriate, postpone SELECT output expressions till after ORDER BY. It is frequently useful for volatile, set-returning, or expensive functions in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are done. Otherwise, the functions might be executed for every row of the table despite the presence of LIMIT, and/or be executed in an unexpected order. For example, in SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10; it's probably desirable that the nextval() values are ordered the same as x, and that nextval() is not run more than 10 times. In the past, Postgres was inconsistent in this area: you would get the desirable behavior if the ordering were performed via an indexscan, but not if it had to be done by an explicit sort step. Getting the desired behavior reliably required contortions like SELECT x, nextval('seq') FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10; This patch conditionally postpones evaluation of pure-output target expressions (that is, those that are not used as DISTINCT, ORDER BY, or GROUP BY columns) so that they effectively occur after sorting, even if an explicit sort step is necessary. Volatile expressions and set-returning expressions are always postponed, so as to provide consistent semantics. Expensive expressions (costing more than 10 times typical operator cost, which by default would include any user-defined function) are postponed if there is a LIMIT or if there are expressions that must be postponed. We could be more aggressive and postpone any nontrivial expression, but there are costs associated with doing so: it requires an extra Result plan node which adds some overhead, and postponement changes the volume of data going through the sort step, perhaps for the worse. Since we tend not to have very good estimates of the output width of nontrivial expressions, it's hard to have much confidence in our ability to predict whether postponement would increase or decrease the cost of the sort; therefore this patch doesn't attempt to make decisions conditionally on that. Between these factors and a general desire not to change query behavior when there's not a demonstrable benefit, it seems best to be conservative about applying postponement. We might tweak the decision rules in the future, though. Konstantin Knizhnik, heavily rewritten by me http://git.postgresql.org/pg/commitdiff/9118d03a8cca3d97327c56bf89a72e328e454e63
  • Re-export a few of createplan.c's make_xxx() functions. CitusDB is using these and don't wish to redesign their code right now. I am not on board with this being a good idea, or a good precedent, but I lack the energy to fight about it. http://git.postgresql.org/pg/commitdiff/570be1f73f385abb557bda15b718d7aac616cc15
  • Get rid of scribbling on a const variable in psql's print.c. Commit a2dabf0e1dda93c8 had the bright idea that it could modify a "const" global variable if it merely casted away const from a pointer. This does not work on platforms where the compiler puts "const" variables into read-only storage. Depressingly, we evidently have no such platforms in our buildfarm ... an oversight I have now remedied. (The one platform that is known to catch this is recent OS X with -fno-common.) Per report from Chris Ruprecht. Back-patch to 9.5 where the bogus code was introduced. http://git.postgresql.org/pg/commitdiff/fc7a9dfddb073a55a226778acd6a9b3f5ea2e626
  • Widen query numbers-of-tuples-processed counters to uint64. This patch widens SPI_processed, EState's es_processed field, PortalData's portalPos field, FuncCallContext's call_cntr and max_calls fields, ExecutorRun's count argument, PortalRunFetch's result, and the max number of rows in a SPITupleTable to uint64, and deals with (I hope) all the ensuing fallout. Some of these values were declared uint32 before, and others "long". I also removed PortalData's posOverflow field, since that logic seems pretty useless given that portalPos is now always 64 bits. The user-visible results are that command tags for SELECT etc will correctly report tuple counts larger than 4G, as will plpgsql's GET GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples than that are still not exactly the norm, but they're becoming more common. Most values associated with FETCH/MOVE distances, such as PortalRun's count argument and the count argument of most SPI functions that have one, remain declared as "long". It's not clear whether it would be worth promoting those to int64; but it would definitely be a large dollop of additional API churn on top of this, and it would only help 32-bit platforms which seem relatively less likely to see any benefit. Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me http://git.postgresql.org/pg/commitdiff/23a27b039d94ba359286694831eafe03cd970eef
  • Fix Windows portability issue in 23a27b039d94ba35. _strtoui64() is available in MSVC builds, but apparently not with other Windows toolchains. Thanks to Petr Jelinek for the diagnosis. http://git.postgresql.org/pg/commitdiff/ab737f6ba9fc0a26d32a95b115d5cd0e24a63191
  • Report memory context stats upon out-of-memory in repalloc[_huge]. This longstanding functionality evidently got lost in commit 3d6d1b585524aab6. Noted while studying an OOM report from Jaime Casanova. Backpatch to 9.5 where the bug was introduced. http://git.postgresql.org/pg/commitdiff/4b980167cb5a489346c5e53afb86280a7d59ebc7
  • Fix memory leak in repeated GIN index searches. Commit d88976cfa1302e8d removed this code from ginFreeScanKeys(): if (entry->list) pfree(entry->list); evidently in the belief that that ItemPointer array is allocated in the keyCtx and so would be reclaimed by the following MemoryContextReset. Unfortunately, it isn't and it won't. It'd likely be a good idea for that to become so, but as a simple and back-patchable fix in the meantime, restore this code to ginFreeScanKeys(). Also, add a similar pfree to where startScanEntry() is about to zero out entry->list. I am not sure if there are any code paths where this change prevents a leak today, but it seems like cheap future-proofing. In passing, make the initial allocation of so->entries[] use palloc not palloc0. The code doesn't depend on unused entries being zero; if it did, the array-enlargement code in ginFillScanEntry() would be wrong. So using palloc0 initially can only serve to confuse readers about what the invariant is. Per report from Felipe de Jesús Molina Bravo, via Jaime Casanova in <CAJGNTeMR1ndMU2Thpr8GPDUfiHTV7idELJRFusA5UXUGY1y-eA@mail.gmail.com> http://git.postgresql.org/pg/commitdiff/ab4ff2889d0bccc32467e681546aabdb87de4958
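
As a minimal illustration of the <ctype.h> portability fix near the top of the list above, the following stand-alone C snippet shows the recommended cast; the helper and its name are invented for the example, not taken from the backend.

    #include <ctype.h>

    /* Count the leading digits of a string.  isdigit() and friends take an
     * int that must be representable as unsigned char (or EOF); passing a
     * plain char is undefined when char is signed and the byte is >= 0x80,
     * so the argument is cast to unsigned char first. */
    static int
    count_leading_digits(const char *s)
    {
        int n = 0;

        while (isdigit((unsigned char) s[n]))
            n++;
        return n;
    }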
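
The pull_var_clause() refactoring item above argues that a bitmask of flags is easier to extend than one enum argument per behavior. Here is a generic, self-contained sketch of that API shape; the flag names and the walk() function are invented for illustration and are not PostgreSQL's.

    #include <stdio.h>

    /* Invented flag bits: a new behavior only needs a new bit, and existing
     * callers that don't care keep passing the same value unchanged. */
    #define WALK_RECURSE_AGGREGATES   0x01
    #define WALK_RECURSE_WINDOWFUNCS  0x02

    static void
    walk(const char *label, int flags)
    {
        if (flags & WALK_RECURSE_AGGREGATES)
            printf("%s: recurse into aggregates\n", label);
        if (flags & WALK_RECURSE_WINDOWFUNCS)
            printf("%s: recurse into window functions\n", label);
    }

    int
    main(void)
    {
        walk("target list", WALK_RECURSE_AGGREGATES | WALK_RECURSE_WINDOWFUNCS);
        return 0;
    }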

Andres Freund pushed:

  • Fix wrong allocation size in c8f621c43. In c8f621c43 I forgot to account for MAXALIGN when allocating a new tuplebuf in ReorderBufferGetTupleBuf(). That happens to currently not cause active problems on a number of platforms because the affected pointer is already aligned, but others, like ppc and hppa, trigger this in the regression test, due to a debug memset clearing memory. Fix that. Backpatch: 9.4, like the previous commit. http://git.postgresql.org/pg/commitdiff/fd45d16f6212df15821684b231a44448389fb002
  • Further improvements to c8f621c43. Coverity and inspection for the issue addressed in fd45d16f found some questionable code. Specifically coverity noticed that the wrong length was added in ReorderBufferSerializeChange() - without immediate negative consequences as the variable isn't used afterwards. During code-review and testing I noticed that a bit of space was wasted when allocating tuple bufs in several places. Thirdly, the debug memset()s in ReorderBufferGetTupleBuf() reduce the error checking valgrind can do. Backpatch: 9.4, like c8f621c43. http://git.postgresql.org/pg/commitdiff/b63bea5fd3bba4d7a61c3beaba51a06f24b38da6
  • plperl: Correctly handle empty arrays in plperl_ref_from_pg_array. plperl_ref_from_pg_array() didn't consider the case that postgres arrays can have 0 dimensions (when they're empty) and accessed the first dimension without a check. Fix that by special casing the empty array case. Author: Alex Hunsaker Reported-By: Andres Freund / valgrind / buildfarm animal skink Discussion: 20160308063240.usnzg6bsbjrne667@alap3.anarazel.de Backpatch: 9.1- http://git.postgresql.org/pg/commitdiff/e66197fa2efa8ae0cab1eed6b2257ab4e2134b1e
  • ltree: Zero padding bytes when allocating memory for externally visible data. ltree/ltree_gist/ltxtquery's headers store data at MAXALIGN alignment, requiring some padding bytes. So far we left these uninitialized. Zero those by using palloc0. Author: Andres Freund Reported-By: Andres Freund / valgrind / buildfarm animal skink Backpatch: 9.1- http://git.postgresql.org/pg/commitdiff/7a1d4a2448c34ed4669d67ae4f24c594545f10b5
  • Add valgrind suppressions for bootstrap related code. Author: Andres Freund Backpatch: 9.4, where we started to maintain valgrind suppressions http://git.postgresql.org/pg/commitdiff/5e43bee8307f1f6f87894c9a4bd9f9045f45c064
  • Add valgrind suppressions for python code. Python's allocator does some low-level tricks for efficiency; unfortunately they trigger valgrind errors. Those tricks can be disabled making instrumentation easier; but few people testing postgres will have such a build of python. So add broad suppressions of the resulting errors. See also https://svn.python.org/projects/python/trunk/Misc/README.valgrind This possibly will suppress valid errors, but without it it's basically impossible to use valgrind with plpython code. Author: Andres Freund Backpatch: 9.4, where we started to maintain valgrind suppressions http://git.postgresql.org/pg/commitdiff/2f1f4439306d2793492e49366d5911e48aa2c4b1
  • Introduce durable_rename() and durable_link_or_rename(). Renaming a file using rename(2) is not guaranteed to be durable in face of crashes; especially on filesystems like xfs and ext4 when mounted with data=writeback. To be certain that a rename() atomically replaces the previous file contents in the face of crashes and different filesystems, one has to fsync the old filename, rename the file, fsync the new filename, fsync the containing directory. This sequence is not generally adhered to currently; which exposes us to data loss risks. To avoid having to repeat this arduous sequence, introduce durable_rename(), which wraps all that. Also add durable_link_or_rename(). Several places use link() (with a fallback to rename()) to rename a file, trying to avoid replacing the target file out of paranoia. Some of those rename sequences need to be durable as well. There seems little reason extend several copies of the same logic, so centralize the link() callers. This commit does not yet make use of the new functions; they're used in a followup commit. Author: Michael Paquier, Andres Freund Discussion: 56583BDD.9060302@2ndquadrant.com Backpatch: All supported branches http://git.postgresql.org/pg/commitdiff/606e0f9841b820d826f837bf741a3e5e9cc62fa1
  • Avoid unlikely data-loss scenarios due to rename() without fsync. Renaming a file using rename(2) is not guaranteed to be durable in face of crashes. Use the previously added durable_rename()/durable_link_or_rename() in various places where we previously just renamed files. Most of the changed call sites are arguably not critical, but it seems better to err on the side of too much durability. The most prominent known case where the previously missing fsyncs could cause data loss is crashes at the end of a checkpoint. After the actual checkpoint has been performed, old WAL files are recycled. When they're filled, their contents are fdatasynced, but we did not fsync the containing directory. An OS/hardware crash in an unfortunate moment could then end up leaving that file with its old name, but new content; WAL replay would thus not replay it. Reported-By: Tomas Vondra Author: Michael Paquier, Tomas Vondra, Andres Freund Discussion: 56583BDD.9060302@2ndquadrant.com Backpatch: All supported branches http://git.postgresql.org/pg/commitdiff/1d4a0ab19a7e45aa8b94d7f720d1d9cefb81ec40
  • Blindly try to fix dtrace enabled builds, broken in 9cd00c45. Reported-By: Peter Eisentraut Discussion: 56E2239E.1050607@gmx.net http://git.postgresql.org/pg/commitdiff/c94f0c29cecc7944a14aa645c8a97a7250bf146b
  • Checkpoint sorting and balancing. Up to now checkpoints were written in the order they're in the BufferDescriptors. That's nearly random in a lot of cases, which performs badly on rotating media, but even on SSDs it causes slowdowns. To avoid that, sort checkpoints before writing them out. We currently sort by tablespace, relfilenode, fork and block number. One of the major reasons that previously wasn't done, was fear of imbalance between tablespaces. To address that balance writes between tablespaces. The other prime concern was that the relatively large allocation to sort the buffers in might fail, preventing checkpoints from happening. Thus pre-allocate the required memory in shared memory, at server startup. This particularly makes it more efficient to have checkpoint flushing enabled, because that'll often result in a lot of writes that can be coalesced into one flush. Discussion: alpine.DEB.2.10.1506011320000.28433@sto Author: Fabien Coelho and Andres Freund http://git.postgresql.org/pg/commitdiff/9cd00c457e6a1ebb984167ac556a9961812a683c
  • Allow to trigger kernel writeback after a configurable number of writes. Currently writes to the main data files of postgres all go through the OS page cache. This means that some operating systems can end up collecting a large number of dirty buffers in their respective page caches. When these dirty buffers are flushed to storage rapidly, be it because of fsync(), timeouts, or dirty ratios, latency for other reads and writes can increase massively. This is the primary reason for regular massive stalls observed in real world scenarios and artificial benchmarks; on rotating disks stalls on the order of hundreds of seconds have been observed. On linux it is possible to control this by reducing the global dirty limits significantly, reducing the above problem. But global configuration is rather problematic because it'll affect other applications; also PostgreSQL itself doesn't always generally want this behavior, e.g. for temporary files it's undesirable. Several operating systems allow some control over the kernel page cache. Linux has sync_file_range(2), several posix systems have msync(2) and posix_fadvise(2). sync_file_range(2) is preferable because it requires no special setup, whereas msync() requires the to-be-flushed range to be mmap'ed. For the purpose of flushing dirty data posix_fadvise(2) is the worst alternative, as flushing dirty data is just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages from the page cache. Thus the feature is enabled by default only on linux, but can be enabled on all systems that have any of the above APIs. While desirable and likely possible this patch does not contain an implementation for windows. With the infrastructure added, writes made via checkpointer, bgwriter and normal user backends can be flushed after a configurable number of writes. Each of these sources of writes controlled by a separate GUC, checkpointer_flush_after, bgwriter_flush_after and backend_flush_after respectively; they're separate because the number of flushes that are good are separate, and because the performance considerations of controlled flushing for each of these are different. A later patch will add checkpoint sorting - after that flushes from the ckeckpoint will almost always be desirable. Bgwriter flushes are most of the time going to be random, which are slow on lots of storage hardware. Flushing in backends works well if the storage and bgwriter can keep up, but if not it can have negative consequences. This patch is likely to have negative performance consequences without checkpoint sorting, but unfortunately so has sorting without flush control. Discussion: alpine.DEB.2.10.1506011320000.28433@sto Author: Fabien Coelho and Andres Freund http://git.postgresql.org/pg/commitdiff/428b1d6b29ca599c5700d4bc4f4ce4c5880369bf
  • Include portability/mem.h into fd.c for MAP_FAILED. Buildfarm members gaur and pademelon are old enough not to know about MAP_FAILED; which is used in 428b1d6. Include portability/mem.h to fix; as already done in a bunch of other places. http://git.postgresql.org/pg/commitdiff/e01157500f26342bf4f067a4eb1e45ab9a3cd410
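
For the durable_rename() item in the list above, here is a rough, POSIX-only sketch of the fsync sequence the commit message describes (no Windows path, error handling collapsed into a return code); it is an illustration of the idea, not the actual backend implementation.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Open a path, fsync it, close it; return 0 on success. */
    static int
    fsync_path(const char *path, int openflags)
    {
        int fd = open(path, openflags);

        if (fd < 0)
            return -1;
        if (fsync(fd) != 0)
        {
            close(fd);
            return -1;
        }
        return close(fd);
    }

    /* fsync the old file, rename it, fsync the new name, then fsync the
     * containing directory so the rename itself survives a crash. */
    static int
    rename_durably(const char *oldpath, const char *newpath, const char *dirpath)
    {
        if (fsync_path(oldpath, O_RDWR) != 0)
            return -1;
        if (rename(oldpath, newpath) != 0)
            return -1;
        if (fsync_path(newpath, O_RDWR) != 0)
            return -1;
        return fsync_path(dirpath, O_RDONLY);
    }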
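
The configurable-writeback item above relies on sync_file_range(2) on Linux. A hedged sketch of the underlying idea follows; the helper name is invented, and the call degrades to a no-op where the API is unavailable.

    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <fcntl.h>

    /* After writing a batch of blocks, ask the kernel to start writeback for
     * just that range, so dirty pages don't accumulate until fsync() forces
     * one huge, latency-spiking flush. */
    static void
    hint_writeback(int fd, off_t offset, off_t nbytes)
    {
    #ifdef SYNC_FILE_RANGE_WRITE
        (void) sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
    #else
        (void) fd;
        (void) offset;
        (void) nbytes;
    #endif
    }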

Joe Conway pushed:

Peter Eisentraut pushed:

Robert Haas pushed:

Álvaro Herrera pushed:

Simon Riggs pushed:

Magnus Hagander pushed:

  • Avoid crash on old Windows with AVX2-capable CPU for VS2013 builds. The Visual Studio 2013 CRT generates invalid code when it makes a 64-bit build that is later used on a CPU that supports AVX2 instructions using a version of Windows before 7SP1/2008R2SP1. Detect this combination, and in those cases turn off the generation of FMA3, per recommendation from the Visual Studio team. The bug is actually in the CRT shipping with Visual Studio 2013, but Microsoft have stated they're only fixing it in newer major versions. The fix is therefore conditioned specifically on being built with this version of Visual Studio, and not previous or later versions. Author: Christian Ullrich http://git.postgresql.org/pg/commitdiff/9d90388247e093cd9b3ead79954df2ac18bfeb66
  • Refactor receivelog.c parameters. Much cruft had accumulated over time with a large number of parameters passed down between functions very deep. With this refactoring, instead introduce a StreamCtl structure that holds the parameters, and pass around a pointer to this structure instead. This makes it much easier to add or remove fields that are needed deeper down in the implementation without having to modify every function header in the file. Patch by me after much nagging from Andres Reviewed by Craig Ringer and Daniel Gustafsson http://git.postgresql.org/pg/commitdiff/38c83c9b7569378d864d8915e291716b8bec15f2
  • Allow setting sample ratio for auto_explain. New configuration parameter auto_explain.sample_ratio makes it possible to log just a fraction of the queries meeting the configured threshold, to reduce the amount of logging. Author: Craig Ringer and Julien Rouhaud Review: Petr Jelinek http://git.postgresql.org/pg/commitdiff/92f03fe76fe6be683a8b7497579158b8a82b2c25
  • Fix order of MemSet arguments. Noted by Tomas Vondra http://git.postgresql.org/pg/commitdiff/a1aa8b7ea0558620106e25c27d0a70ee4ac9d6a8
  • Rename auto_explain.sample_ratio to sample_rate. Per suggestion from Tomas Vondra Author: Julien Rouhaud http://git.postgresql.org/pg/commitdiff/7a8d8748362d4d8505e320c3eaab4a2c2463e3a6
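
The MemSet argument-order fix in the list above is an instance of a classic C pitfall; in plain libc terms, with an invented struct for the example:

    #include <string.h>

    struct counters
    {
        long hits;
        long misses;
    };

    static void
    reset_counters(struct counters *c)
    {
        /* Correct order: (destination, fill byte, length).  Swapping the last
         * two arguments, i.e. memset(c, sizeof(*c), 0), compiles fine but
         * writes zero bytes and leaves the struct untouched. */
        memset(c, 0, sizeof(*c));
    }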

Teodor Sigaev pushed:

Rejected Patches (so far)

No one was disappointed this week :-)

Pending Patches

Amit Langote sent in eight more revisions of a patch to implement a vacuum progress checker.

David Rowley and Haribabu Kommi traded patches to implement parallel aggregation.

Peter Geoghegan sent in a patch to avoid incorrectly indicating exclusion constraint wait.

Michaël Paquier and Petr Jelínek traded patches to add support for MSVC 2015.

Fabien COELHO sent in four more revisions of a patch to extend pgbench expressions with functions.

Peter Geoghegan sent in another revision of a patch to use quicksort for every external sort run.

Michaël Paquier sent in four more revisions of a patch to fix a recovery delay.

Robert Haas sent in another revision of a patch to add an Idle In Transaction Session timeout GUC.

Robbie Harwood sent in another revision of a patch to add GSSAPI encryption support.

Ashutosh Bapat sent in another revision of a patch to allow pushing sorted joins down to the FDW.

Pavan Deolasee sent in two more revisions of a patch to reduce the amount of WAL in the 2PC case.

David Rowley sent in two more revisions of a patch to make outer joins faster when the outer side is unique.

Tomas Vondra sent in five more revisions of a patch to implement multivariate statistics.

Dilip Kumar sent in three more revisions of a patch to scale up relation extension.

Thomas Munro sent in four more revisions of a patch to implement "causal reads."

Kyotaro HORIGUCHI sent in another revision of a patch to allow index-only scans with partial indexes.

Fabien COELHO sent in another revision of a patch to fix pgbench duration under rate.

Michaël Paquier sent in another revision of a patch to delay status update to the end of recovery.

David G. Johnston sent in a patch to continue using \pset titles during \watch iterations after the first in psql.

Etsuro Fujita sent in another revision of a patch to make writes faster with the PostgreSQL FDW.

Gilles Darold sent in two revisions of a patch to add a pg_current_logfile() function.

Alexander Korotkov sent in two more revisions of a patch for access method extendability.

Mithun Cy and Amit Kapila traded patches to fix an issue where Explain [Analyze] produces a parallel scan for SELECT INTO TABLE statements. As writes have not yet been parallelized, this is a problem.

Kaigai Kouhei sent in two more revisions of a patch to rework CustomScan [de]serialization.

Alexander Kuleshov sent in a patch to simplify search of end of argv in save_ps_display_args().

Stas Kelvich sent in a patch to add 2PC support to pg_logical.

Pavel Stěhule sent in another revision of a patch to add parse_ident().

Corey Huinker sent in two more revisions of a patch to add generate_series(date, date[, integer]).

Peter Eisentraut sent in a patch to clear OpenSSL error queue before OpenSSL calls.

Alexey Grishchenko sent in two revisions of a patch to fix an endless loop calling PL/Python set returning functions.

Grzegorz Sampolski and Haribabu Kommi traded patches to add rhost to PAM auth.

Thomas Munro sent in three more revisions of a patch to detect SSI conflicts before reporting constraint violations.

Kyotaro HORIGUCHI sent in another revision of a patch to fix a WAL logging issue.

David Steele sent in another revision of a patch to add client log output filtering.

Sherrylyn Branchaw sent in a patch to change error code for hstore syntax error.

Alexander Kuleshov sent in a patch to use MemoryContextAlloc() in the MemoryContextAllocZero() and MemoryContextAllocZeroAligned().

Daniel Verité sent in another revision of a patch to add \crosstabview to psql.

Alexander Korotkov sent in another revision of a patch to implement partial sort.

Dagfinn Ilmari Mannsåker sent in a patch to fix obsolete wording in PL/Perl hv_store_string comment.

Jim Nasby sent in a patch to improve error handling in pltcl.

Jim Nasby sent in another revision of a patch to ensure that configure checks for a valid [1-65535] port.

Alexander Korotkov sent in another revision of a patch to move PinBuffer and UnpinBuffer to atomics.

Jim Nasby sent in another revision of a patch to implement \gexec in psql.

Thomas Munro sent in a patch to make pg_stat_get_progress_info() strict.

David Rowley sent in two more revisions of a patch to help implement combining aggregates.

Noah Misch and Tomas Vondra traded patches to split stats file per database.

by N Bougain on Tuesday 15 March 2016 at 22:53

Sunday 13 March 2016

Guillaume Lelarge

Translation of the 9.5 manual finished

Quite a bit behind schedule this time, but we have nonetheless finished translating the PostgreSQL 9.5 manual. Naturally, all the manuals have also been updated to the latest minor releases.

Please do not hesitate to report any problem with the translation to me.

Likewise, I have nearly finished translating the client applications. That translation should be available for version 9.5.2 (no date known yet).

by Guillaume Lelarge on Sunday 13 March 2016 at 10:41

Friday 11 March 2016

Nicolas Gollet

PostgreSQL Studio, a web GUI for PostgreSQL

PostgreSQL Studio is a web-based graphical interface for PostgreSQL, compatible with the latest PostgreSQL release (9.5, including for example support for RLS). The product is aimed more at developers than at DBAs/admins.

For the list of features, see the official website.

pgstudio installs easily in a servlet container such as Tomcat, Jetty, JBoss...

So that you can form your own opinion, I have deployed the latest version, 2.0, on RedHat's "cloud" platform in a JBoss/Java7 container: https://pgstudio-ngpe.rhcloud.com/pgstudio/

In my opinion the product is not yet 100% mature, but it is well on its way...

Happy testing :)

by Nicolas GOLLET on Friday 11 March 2016 at 17:53

Tuesday 8 March 2016

Sébastien Lardière

Dates to remember

Three PostgreSQL dates to remember: 17 March, in Nantes, a first meetup, where I will present what's new in PostgreSQL 9.5; 31 March, in Paris, where I will try to wind back the history of databases; 31 May, in Lille, where I will dive into PostgreSQL's storage structures. These three dates are an opportunity... Read more: Dates à retenir

by Sébastien Lardière on Tuesday 8 March 2016 at 08:30

PostgreSQL.fr News

PostgreSQL Weekly News - 6 March 2016

PgConf Silicon Valley 2016 will take place from 14 to 16 November 2016: http://www.pgconfsv.com/

[Translator's note: meetup in Nantes on Thursday 17 March: http://www.meetup.com/fr-FR/PostgreSQL-User-Group-Nantes/]

PostgreSQL-derived product news

PostgreSQL job offers in March

PostgreSQL Local

  • The first pan-Asian PostgreSQL conference will be held from 17 to 19 March 2016 in Singapore. Registration is open: http://2016.pgday.asia/
  • Nordic PGDay, a single-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • Registration for PGDay Paris 2016, scheduled for 31 March, is open: http://www.pgday.paris/registration/
  • The 8th Session PostgreSQL will take place on 6 April 2016 in Lyon (France).
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York City. Registration is open: http://www.pgconf.us/2016/
  • LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for papers is now open: http://www.linuxfestnorthwest.org/2016/present
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. Speaker submissions are still being accepted: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will be held in Ottawa from 17 to 21 May 2016: http://www.pgcon.org/
  • This year's Swiss PGDay will be held at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for papers is open: http://www.pgday.ch/
  • "5432 ... Meet us!" will take place in Milan (Italy) on 28 & 29 June 2016. The call for papers runs until 14 March: http://5432meet.us/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translated by the PostgreSQLFr team under the CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160306234210.GE28543@fetter.org

Submit your articles or announcements by Sunday at 15:00 Pacific time. Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org, and in Spanish to pwn (a) arpug.com.ar.

Applied Patches

Tom Lane pushed:

  • Avoid multiple free_struct_lconv() calls on same data. A failure partway through PGLC_localeconv() led to a situation where the next call would call free_struct_lconv() a second time, leading to free() on already-freed strings, typically leading to a core dump. Add a flag to remember whether we need to do that. Per report from Thom Brown. His example case only provokes the failure as far back as 9.4, but nonetheless this code is obviously broken, so back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/907e4dd2b104bdcb4af042065a92fcd73d5790ec
  • Fix build under OPTIMIZER_DEBUG. In commit 19a541143a09c067 I replaced RelOptInfo.width with RelOptInfo.reltarget.width, but I missed updating debug_print_rel() for that because it's not compiled by default. Reported by Salvador Fandino, patch by Michael Paquier. http://git.postgresql.org/pg/commitdiff/05893712cc9950b7c4c312aa9d3d444375bf15a2
  • Remove useless unary plus. It's harmless, but might confuse readers. Seems to have been introduced in 6bc8ef0b7f1f1df3. Back-patch, just to avoid cosmetic cross-branch differences. Amit Langote http://git.postgresql.org/pg/commitdiff/c110678a47aac87c661785a70061e160cd17713d
  • Improve error message for rejecting RETURNING clauses with dropped columns. This error message was written with only ON SELECT rules in mind, but since then we also made RETURNING-clause targetlists go through the same logic. This means that you got a rather off-topic error message if you tried to add a rule with RETURNING to a table having dropped columns. Ideally we'd just support that, but some preliminary investigation says that it might be a significant amount of work. Seeing that Nicklas Avén's complaint is the first one we've gotten about this in the ten years or so that the code's been like that, I'm unwilling to put much time into it. Instead, improve the error report by issuing a different message for RETURNING cases, and revise the associated comment based on this investigation. Discussion: 1456176604.17219.9.camel@jordogskog.no http://git.postgresql.org/pg/commitdiff/8d8ff5f7db7d58240fac7d5f620308c91485b253
  • Suppress scary-looking log messages from async-notify isolation test. I noticed that the async-notify test results in log messages like these: LOG: could not send data to client: Broken pipe FATAL: connection to client lost This is because it unceremoniously disconnects a client session that is about to have some NOTIFY messages delivered to it. Such log messages during a regression test might well cause people to go looking for a problem that doesn't really exist (it did cause me to waste some time that way). We can shut it up by adding an UNLISTEN command to session teardown. Patch HEAD only; this doesn't seem significant enough to back-patch. http://git.postgresql.org/pg/commitdiff/3d523564c53ab8f35edf4d20627f0a375a17624d
  • Improve coverage of pltcl regression tests. Test composite-type arguments and the argisnull and spi_lastoid Tcl commands. This stuff was not covered before, but needs to be exercised since the upcoming Tcl object-conversion patch changes these code paths (and broke at least one of them). http://git.postgresql.org/pg/commitdiff/68c521eb92c3515e3306f51a7fd3f32d16c97524
  • Fix TAP tests for older Perls. Commit 7132810c (Retain tempdirs for failed tests) used Test::More's is_passing method, but that was added in Test::More 0.89_01 which is sometime later than Perl 5.10.1. Popular platforms such as RHEL6 don't have that, nevermind some of our older dinosaurs. Do it the hard way. Michael Paquier, based on research by Craig Ringer http://git.postgresql.org/pg/commitdiff/3b8d7215533ed3128b1b9174eae830d70c0453d0
  • Convert PL/Tcl to use Tcl's "object" interfaces. The original implementation of Tcl was all strings, but they improved performance significantly by introducing typed "objects" (integers, lists, code, etc). It's past time we made use of that; that happened in Tcl 8.0 which was released in 1997. This patch also modernizes some of the error-reporting code, which may cause small changes in the spelling of complaints about bad calls to PL/Tcl-provided commands. Jim Nasby and Karl Lehenbauer, reviewed by Victor Wagner http://git.postgresql.org/pg/commitdiff/287822068246a6ae30bb2c7191de727672ae6328
  • Make PL/Tcl require Tcl 8.4 or later. As of commit 287822068246a6ae30bb2c7191de727672ae6328, PL/Tcl will not compile against pre-8.0 Tcl, whereas it used to work (more or less anyway) with quite prehistoric versions. As long as we're moving these goalposts, let's reinstall them at someplace that has some thought behind it. This commit sets the minimum allowed Tcl version at 8.4, and rips out some bits of compatibility cruft that are in consequence no longer needed. Reasons for requiring 8.4 include: * 8.4 was released in 2002; there seems little reason to believe that anyone would want to use older versions with Postgres 9.6+. * We have no buildfarm members testing anything older than 8.4, and thus no way to know if it's broken. * We need at least 8.1 to allow enforcement of database encoding security (8.1 standardized Tcl on using UTF8 internally, before that it was pretty unpredictable). * Some versions between 8.1 and 8.4 allowed the backend to become multithreaded, which is disastrous. We need at least 8.4 to be able to disable the Tcl notifier subsystem to prevent that. A small side benefit is that we can make the code more readable by doing s/CONST84/const/g. http://git.postgresql.org/pg/commitdiff/e2609323eb58ee273ac03a66343447e6a56154d5
  • Fix PL/Tcl's encoding conversion logic. PL/Tcl appears to contain logic to convert strings between the database encoding and UTF8, which is the only encoding modern Tcl will deal with. However, that code has been disabled since commit 034895125d648b86, which made it "#if defined(UNICODE_CONVERSION)" and neglected to provide any way for that symbol to become defined. That might have been all right back in 2001, but these days we take a dim view of allowing strings with incorrect encoding into the database. Remove the conditional compilation, fix warnings about signed/unsigned char conversions, clean up assorted places that didn't bother with conversions. (Notably, there were lots of assumptions that database table and field names didn't need conversion...) Add a regression test based on plpython_unicode. It's not terribly thorough, but better than no test at all. http://git.postgresql.org/pg/commitdiff/c8c7c93de8e116d802eddfd8821f8f77588aee00
  • Create stub functions to support pg_upgrade of old contrib/tsearch2. Commits 9ff60273e35cad6e and dbe2328959e12701 adjusted the declarations of some core functions referenced by contrib/tsearch2's install script, forgetting that in a pg_upgrade situation, we'll be trying to restore operator class definitions that reference the old signatures. We've hit this problem before; solve it in the same way as before, namely by installing stub functions that have the expected signature and just invoke the correct function. Per report from Jeff Janes. (Someday we ought to stop supporting contrib/tsearch2, but I'm not sure today is that day.) http://git.postgresql.org/pg/commitdiff/eb43e851d6b3fa0c4516efcfdf29a183f7717a43
  • Fix json_to_record() bug with nested objects. A thinko concerning nesting depth caused json_to_record() to produce bogus output if a field of its input object contained a sub-object with a field name matching one of the requested output column names. Per bug #13996 from Johann Visagie. I added a regression test case based on his example, plus parallel tests for json_to_recordset, jsonb_to_record, jsonb_to_recordset. The latter three do not exhibit the same bug (which suggests that we may be missing some opportunities to share code...) but testing seems like a good idea in any case. Back-patch to 9.4 where these functions were introduced. http://git.postgresql.org/pg/commitdiff/a9d199f6d3b998929cdb8e8aa61e5cd8db9b220f
  • Make stats regression test robust in the face of parallel query. Historically, the wait_for_stats() function in this test has simply checked for a report of an indexscan on tenk2, corresponding to the last command issued before we expect stats updates to appear. However, with parallel query that indexscan could be done by a parallel worker that will emit its stats counters to the collector before the session's main backend does (a full second before, in fact, thanks to the "pg_sleep(1.0)" added by commit 957d08c81f9cc277). That leaves a sizable window in which an autovacuum-triggered write of the stats files would present a state in which the indexscan on tenk2 appears to have been done, but none of the write updates performed by the test have been. This is evidently the explanation for intermittent failures seen by me and on buildfarm member mandrill. To fix, we should check separately for both the tenk2 seqscan and indexscan counts, since those might be reported by different processes that could be delayed arbitrarily on an overloaded test machine. And we need to check for at least one update-related count. If we ever allow parallel workers to do writes, this will get even more complicated ... but in view of all the other hard problems that will entail, I don't feel a need to solve this one today. Per research by Rahila Syed and myself; part of this patch is Rahila's. http://git.postgresql.org/pg/commitdiff/60690a6fe8351995b1eeb9a53f2b634c3bce3a3d
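
The free_struct_lconv() fix at the top of the list above comes down to remembering whether the cached strings are still valid before freeing them again. Here is a generic sketch of that pattern; the variable names are invented for the example and do not match the backend's.

    #include <stdlib.h>

    static char *cached_decimal_point = NULL;
    static char *cached_thousands_sep = NULL;
    static int   cache_is_valid = 0;        /* the remembered state */

    static void
    free_cached_lconv(void)
    {
        if (!cache_is_valid)
            return;                         /* never filled, or already freed */
        free(cached_decimal_point);
        free(cached_thousands_sep);
        cached_decimal_point = NULL;
        cached_thousands_sep = NULL;
        cache_is_valid = 0;
    }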

Dean Rasheed pushed:

  • Fix incorrect varlevelsup in security_barrier_replace_vars(). When converting an RTE with securityQuals into a security barrier subquery RTE, ensure that the Vars in the new subquery's targetlist all have varlevelsup = 0 so that they correctly refer to the underlying base relation being wrapped. The original code was creating new Vars by copying them from existing Vars referencing the base relation found elsewhere in the query, but failed to account for the fact that such Vars could come from sublink subqueries, and hence have varlevelsup > 0. In practice it looks like this could only happen with nested security barrier views, where the outer view has a WHERE clause containing a correlated subquery, due to the order in which the Vars are processed. Bug: #13988 Reported-by: Adam Guthrie Backpatch-to: 9.4, where updatable SB views were introduced http://git.postgresql.org/pg/commitdiff/41fedc24626696fdf55d0c43131d91757dbe1c66

Álvaro Herrera pushed:

Peter Eisentraut pushed:

Robert Haas pushed:

Andres Freund pushed:

  • logical decoding: fix decoding of a commit's commit time. When adding replication origins in 5aa235042, I somehow managed to set the timestamp of decoded transactions to InvalidXLogRecptr when decoding one made without a replication origin. Fix that, and the wrong type of the new commit_time variable. This didn't trigger a regression test failure because we explicitly don't show commit timestamps in the regression tests, as they obviously are variable. Add a test that checks that a decoded commit's timestamp is within minutes of NOW() from before the commit. Reported-By: Weiping Qu Diagnosed-By: Artur Zakirov Discussion: 56D4197E.9050706@informatik.uni-kl.de, 56D42918.1010108@postgrespro.ru Backpatch: 9.5, where 5aa235042 originates. http://git.postgresql.org/pg/commitdiff/7c17aac69dcae610b08c5965161151cd282f16bc
  • Force synchronous_commit=on in test_decoding's concurrent_ddl_dml.spec. Otherwise running installcheck-force on a server with synchronous_commit=off will result in the tests failing. All the other tests already do so... Backpatch: 9.4, where logical decoding was added http://git.postgresql.org/pg/commitdiff/1986c3c440151b056877b446e7d9c2861906aa26
  • logical decoding: old/newtuple in spooled UPDATE changes was switched around. Somehow I managed to flip the order of restoring old & new tuples when de-spooling a change in a large transaction from disk. This happens to only take effect when a change is spooled to disk which has old/new versions of the tuple. That is only the case for UPDATEs where the primary key changed or where replica identity is changed to FULL. The tests didn't catch this because either spooled updates, or updates that changed primary keys, were tested; not both at the same time. Found while adding tests for the following commit. Backpatch: 9.4, where logical decoding was added http://git.postgresql.org/pg/commitdiff/0bda14d54cf24dedcd2011559a53cc62702e421b (A rough test_decoding usage sketch follows this list.)
  • logical decoding: Fix handling of large old tuples with replica identity full. When decoding the old version of an UPDATE or DELETE change, and if that tuple was bigger than MaxHeapTupleSize, we either Assert'ed out, or failed in more subtle ways in non-assert builds. Normally individual tuples aren't bigger than MaxHeapTupleSize, with big datums toasted. But that's not the case for the old version of a tuple for logical decoding; the replica identity is logged as one piece. With the default replica identity btree limits that to small tuples, but that's not the case for FULL. Change the tuple buffer infrastructure to separately allocate over-large tuples, instead of always going through the slab cache. This unfortunately requires changing the ReorderBufferTupleBuf definition, we need to store the allocated size someplace. To avoid requiring output plugins to recompile, don't store HeapTupleHeaderData directly after HeapTupleData, but point to it via t_data; that leaves room for the allocated size. As there's no reason for an output plugin to look at ReorderBufferTupleBuf->t_data.header, remove the field. It was just a minor convenience having it directly accessible. Reported-By: Adam Dratwiński Discussion: CAKg6ypLd7773AOX4DiOGRwQk1TVOQKhNwjYiVjJnpq8Wo+i62Q@mail.gmail.com http://git.postgresql.org/pg/commitdiff/c8f621c43a599b35dc004ee09627bf4688cbbb84
  • logical decoding: Tell reorderbuffer about all xids. Logical decoding's reorderbuffer keeps transactions in an LSN ordered list for efficiency. To make that efficiently possible upper-level xids are forced to be logged before nested subtransaction xids. That only works though if these records are all looked at: Unfortunately we didn't do so for e.g. row level locks, which are otherwise uninteresting for logical decoding. This could lead to errors like: "ERROR: subxact logged without previous toplevel record". It's not sufficient to just look at row locking records, the xid could appear first due to a lot of other types of records (which will trigger the transaction to be marked logged with MarkCurrentTransactionIdLoggedIfAny). So invent infrastructure to tell reorderbuffer about xids seen, when they'd otherwise not pass through reorderbuffer.c. Reported-By: Jarred Ward Bug: #13844 Discussion: 20160105033249.1087.66040@wrigleys.postgresql.org Backpatch: 9.4, where logical decoding was added http://git.postgresql.org/pg/commitdiff/d9e903f3cbbd00c7ba7d4974e6852c3d2cbf4447
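
For context, a rough test_decoding usage sketch touching the areas these fixes concern (it assumes wal_level = logical, a free replication slot, and the test_decoding contrib module; table and slot names are invented, and this is not any of the exact reproducers):

    SELECT pg_create_logical_replication_slot('s1', 'test_decoding');
    CREATE TABLE t (id int PRIMARY KEY, payload text);
    ALTER TABLE t REPLICA IDENTITY FULL;   -- the whole old row is logged for UPDATE/DELETE
    INSERT INTO t VALUES (1, 'old');
    UPDATE t SET payload = 'new' WHERE id = 1;
    SELECT data FROM pg_logical_slot_get_changes('s1', NULL, NULL);
    SELECT pg_drop_replication_slot('s1');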

Magnus Hagander pushed:

Simon Riggs pushed:

Teodor Sigaev pushed:

Fujii Masao pushed:

  • Ignore recovery_min_apply_delay until recovery has reached consistent state. Previously recovery_min_apply_delay was applied even before recovery had reached consistency. This could cause us to wait a long time unexpectedly for read-only connections to be allowed. It's problematic because the standby was useless during that wait time. This patch changes recovery_min_apply_delay so that it's applied once the database has reached the consistent state. That is, even if the delay is set, the standby tries to replay WAL records as fast as possible until it has reached consistency. Author: Michael Paquier Reviewed-By: Julien Rouhaud Reported-By: Greg Clough Backpatch: 9.4, where recovery_min_apply_delay was added Bug: #13770 Discussion: http://www.postgresql.org/message-id/20151111155006.2644.84564@wrigleys.postgresql.org http://git.postgresql.org/pg/commitdiff/d34794f7d5566effd342dd0ebaca3de3b48656f0
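
recovery_min_apply_delay is a standby-side recovery parameter (set in recovery.conf in the 9.4/9.5 file layout, e.g. recovery_min_apply_delay = '5min'). A minimal way to watch the resulting apply lag from SQL on the standby, assuming such a delay has been configured:

    SELECT pg_is_in_recovery(),
           now() - pg_last_xact_replay_timestamp() AS apply_lag;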

Joe Conway pushed:

  • Expose control file data via SQL accessible functions. Add four new SQL accessible functions: pg_control_system(), pg_control_checkpoint(), pg_control_recovery(), and pg_control_init() which expose a subset of the control file data. Along the way move the code to read and validate the control file to src/common, where it can be shared by the new backend functions and the original pg_controldata frontend program. Patch by me, significant input, testing, and review by Michael Paquier. http://git.postgresql.org/pg/commitdiff/dc7d70ea05deca9dfc6a25043d406b57cc8f6c30
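
A quick way to see what the new functions expose, on a server built from this commit:

    SELECT * FROM pg_control_system();
    SELECT * FROM pg_control_checkpoint();
    SELECT * FROM pg_control_recovery();
    SELECT * FROM pg_control_init();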

Rejected patches (to date)

No one was disappointed this week :-)

Pending patches

Peter Eisentraut sent in another revision of a patch to make pg_ctl promote wait for promotion to complete.

Thomas Munro sent in two more revisions of a patch to add 'causal reads.'

Joe Conway and Pavel Stěhule traded patches to add a SET ROLE hook.

Alexander Korotkov sent in another revision of a patch to implement access method extensibility.

Kaigai Kouhei sent in a patch to rework the serialization/deserialization interface for CustomScan.

Michaël Paquier sent in a patch to fix a bug where setting OPTIMIZER_DEBUG broke the compilation.

Teodor Sigaev sent in another revision of a patch to make OR use indexes.

Dmitry Dolgov sent in another revision of a patch to add a jsonb_insert() function.
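
For reference, jsonb_insert() as eventually committed for 9.6 has the signature jsonb_insert(target jsonb, path text[], new_value jsonb, insert_after boolean DEFAULT false) and behaves as below; the revision under review here may differ in detail:

    SELECT jsonb_insert('{"a": [0, 1, 2]}', '{a, 1}', '"x"');
    -- => {"a": [0, "x", 1, 2]}
    SELECT jsonb_insert('{"a": [0, 1, 2]}', '{a, 1}', '"x"', true);
    -- => {"a": [0, 1, "x", 2]}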

Petr Jelínek sent in another revision of a patch to implement generic logical WAL messages.

Peter Geoghegan sent in a patch to add amcheck, a B-Tree integrity checking tool.

Peter Eisentraut sent in another revision of a patch to remove the "archive" WAL level.

Abhijit Menon-Sen sent in a patch to add a DEPENDENCY_AUTO_EXTENSION dependency type.

Fabrízio de Royes Mello sent in a patch to reduce the lock levels of other reloptions in ALTER TABLE.

Iacob Catalin and Pavel Stěhule traded patches to implement ereport in PL/PythonU.

Stephen Frost sent in a patch to make pg_dump dump ACLs for pg_catalog objects.

Álvaro Herrera sent in a patch to fix pg_dump and COPY for huge (~1 GB) lines.

Kevin Grittner sent in two more revisions of a patch to implement "snapshot too old" configured by time.

Michaël Paquier sent in a patch to fix OOM error handling in COPY protocol of libpq.

Peter Eisentraut sent in a patch to fix the jumble that is the state of the pg_resetxlog documentation.

Pavan Deolasee sent in a patch to WAL log only the necessary part of 2PC GID.

Stephen Frost sent in another revision of a patch to implement default roles.

Julien Rouhaud sent in two more revisions of a patch to add hooks to autovacuum and add a pg_stat_autovacuum to use same.

Dilip Kumar sent in two more revisions of a patch to help scale relation extension.

Alexander Korotkov sent in another revision of a patch to move PinBuffer and UnpinBuffer to atomics.

Alexander Korotkov sent in another revision of a patch to implement partial sort.

Roma Sokolov sent in another revision of a patch to fix DROP OPERATOR to reset links to itself on commutator and negator.

Anastasia Lubennikova sent in another revision of a patch to add covering + unique indexes.

Craig Ringer sent in three more revisions of a patch to allow logical slots to follow timeline switches.

Etsuro Fujita sent in another revision of a patch to improve write performance in the PostgreSQL FDW.

Michaël Paquier sent in another revision of a patch to fix handling of --enable-tap-tests in MSVC scripts.

Robert Haas sent in a patch to add a contrib module to examine the visibility map.

Tomas Vondra sent in another revision of a patch to add multivariate statistics for query planning.

Álvaro Herrera sent in a patch to fix the omission of BRIN from the CREATE OPERATOR CLASS documentation.

Tomas Vondra sent in a patch to check DBEntry->stats_timestamp in pgstat_recv_inquiry() and ignore requests that are already resolved by the last write.

Haribabu Kommi sent in another revision of a patch to add a pg_hba_lookup() function.

Emre Hasegeli sent in a patch to add SP-GiST support for inet datatypes.

Alexander Shulgin sent in another revision of a patch to account for NULLs in ANALYZE more strictly and try to account for skewed distributions in ANALYZE.

Pavel Stěhule sent in another revision of a patch to add ELEMENT OF to PL/pgsql declarations.

Mithun Cy sent in another revision of a patch to cache data in GetSnapshotData.

Kyotaro HORIGUCHI sent in a patch to allow :: casts to tab complete in psql.

SAWADA Masahiko sent in two more revisions of a patch to allow N>1 synchronous standbys.

Robert Haas sent in another revision of a patch to fix an issue that manifested with the postgres_fdw in force_parallel_mode on ppc.

Amit Kapila and Thom Brown traded patches to replace pg_stat_activity.waiting with something more descriptive.

Michaël Paquier sent in two more revisions of a patch to support VS 2015.

David Rowley and Haribabu Kommi traded patches to implement parallel aggregate.

Fabien COELHO sent in three more revisions of a patch to add pgbench stats per script, etc.

Petr Jelínek sent in another revision of a patch to add a sequence access method.

Fabien COELHO sent in another revision of a patch to extend pgbench expressions with functions.

Amit Langote sent in two more revisions of a patch to implement a vacuum progress checker.

Tom Lane sent in two more revisions of a patch to path-ify the upper planner.

Michaël Paquier sent in another revision of a patch to ensure that xlog fsync happens in a way that does not lose data.

Guillaume Lelarge sent in a patch to fix a typo in the psql documentation.

by N Bougain on Tuesday 8 March 2016 at 00:32

Tuesday 1 March 2016

PostgreSQL.fr news

PostgreSQL Weekly News - 28 February 2016

News from derived products

PostgreSQL-related job offers for February

PostgreSQL Local

  • The first pan-Asian PostgreSQL conference will be held on 17 and 19 March 2016 in Singapore. Registration is open: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • Registration for PGDay Paris 2016, scheduled for 31 March, is open: http://www.pgday.paris/registration/
  • The 8th PostgreSQL Session will take place on 6 April 2016 in Lyon (France). The call for papers runs until 29 February at call-for-paper AT postgresql-sessions DOT org.
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York City. Registration is open: http://www.pgconf.us/2016/
  • LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for speakers is now open: http://www.linuxfestnorthwest.org/2016/present
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. Speaker proposals are still being accepted: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will be held from 17 to 21 May 2016 in Ottawa: http://www.pgcon.org/
  • The Swiss PGDay will this year be held at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for speakers is open: http://www.pgday.ch/
  • "5432 ... Meet us!" will take place in Milan (Italy) on 28 & 29 June 2016. The call for papers runs until 28 February: http://5432meet.us/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under the CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20160228235746.GA1232@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org, and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Andres Freund pushed:

  • Fix wrong keysize in PrivateRefCountHash creation. In 4b4b680c3 I accidentally used sizeof(PrivateRefCountArray) instead of sizeof(PrivateRefCountEntry) when creating the refcount overflow hashtable. As the former is bigger than the latter, this luckily only resulted in a slightly increased memory usage when many buffers are pinned in a backend. Reported-By: Takashi Horikawa Discussion: 73FA3881462C614096F815F75628AFCD035A48C3@BPXM01GP.gisp.nec.co.jp Backpatch: 9.5, where the new ref count infrastructure was introduced http://git.postgresql.org/pg/commitdiff/ea56b06cf77a6932a74f9d4ec6c950a333d1527d

Tom Lane pushed:

  • Remove redundant PGPROC.lockGroupLeaderIdentifier field. We don't really need this field, because it's either zero or redundant with PGPROC.pid. The use of zero to mark "not a group leader" is not necessary since we can just as well test whether lockGroupLeader is NULL. This does not save very much, either as to code or data, but the simplification seems worthwhile anyway. http://git.postgresql.org/pg/commitdiff/73bf8715aa7430bd003516bde448507fbe789c05
  • Create a function to reliably identify which sessions block which others. This patch introduces "pg_blocking_pids(int) returns int[]", which returns the PIDs of any sessions that are blocking the session with the given PID. Historically people have obtained such information using a self-join on the pg_locks view, but it's unreasonably tedious to do it that way with any modicum of correctness, and the addition of parallel queries has pretty much broken that approach altogether. (Given some more columns in the view than there are today, you could imagine handling parallel-query cases with a 4-way join; but ugh.) The new function has the following behaviors that are painful or impossible to get right via pg_locks: 1. Correctly understands which lock modes block which other ones. 2. In soft-block situations (two processes both waiting for conflicting lock modes), only the one that's in front in the wait queue is reported to block the other. 3. In parallel-query cases, reports all sessions blocking any member of the given PID's lock group, and reports a session by naming its leader process's PID, which will be the pg_backend_pid() value visible to clients. The motivation for doing this right now is mostly to fix the isolation tests. Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized isolationtester's is-it-waiting query by removing its ability to recognize nonconflicting lock modes, as a crude workaround for the inability to handle soft-block situations properly. But even without the lock mode tests, the old query was excessively slow, particularly in CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new deadlock-hard test because the deadlock timeout elapses before they can probe the waiting status of all eight sessions. Replacing the pg_locks self-join with use of pg_blocking_pids() is not only much more correct, but a lot faster: I measure it at about 9X faster in a typical dev build with Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the test, without having to lengthen deadlock_timeout yet more and thus slow down the test for everyone else. http://git.postgresql.org/pg/commitdiff/52f5d578d6c29bf254e93c69043b817d4047ca67
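
A typical use of the new function is to list every blocked session together with the PIDs blocking it, for example:

    SELECT blocked.pid,
           blocked.query AS blocked_query,
           pg_blocking_pids(blocked.pid) AS blocked_by
    FROM pg_stat_activity AS blocked
    WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;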

Noah Misch pushed:

Robert Haas pushed:

  • Enable parallelism for prepared statements and extended query protocol. Parallel query can't handle running a query only partially rather than to completion. However, there seems to be no way to run a statement prepared via SQL PREPARE other than to completion, so we can enable it there without a problem. The situation is more complicated for the extended query protocol. libpq seems to provide no way to send an Execute message with a non-zero rowcount, but some other client might. If that happens, and a parallel plan was chosen, we'll execute the parallel plan without using any workers, which may be somewhat inefficient but should still work. Hopefully this won't be a problem; users can always set max_parallel_degree=0 to avoid choosing parallel plans in the first place. Amit Kapila, reviewed by me. http://git.postgresql.org/pg/commitdiff/57a6a72b6bc98f3003e87bc31de4b9c2c89fe019
  • Add new FDW API to test for parallel-safety. This is basically a bug fix; the old code assumes that a ForeignScan is always parallel-safe, but for postgres_fdw, for example, this is definitely false. It should be true for file_fdw, though, since a worker can read a file from the filesystem just as well as any other backend process. Original patch by Thomas Munro. Documentation, and changes to the comments, by me. http://git.postgresql.org/pg/commitdiff/35746bc348b6bf1f690fe17f4f80cfb68e22f504
  • On second thought, disable parallelism for prepared statements. CREATE TABLE .. AS EXECUTE can turn an apparently read-only query into a write operation, which parallel query can't handle. It's a bit of a shame that this requires us to avoid parallel query for queries prepared via PREPARE in all cases, but for right now it does. http://git.postgresql.org/pg/commitdiff/7bea19d0a9d3e6975418ffe685fb510bd31ab434 (See the sketch after this list.)
  • Respect TEMP_CONFIG when running contrib regression tests. Thomas Munro http://git.postgresql.org/pg/commitdiff/9117985b6ba9beda4f280f596035649fc23b6233
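
To illustrate the prepared-statement reversal above, the hazard is roughly this (hypothetical table names):

    PREPARE q AS SELECT * FROM big_table;   -- looks read-only at PREPARE time
    CREATE TABLE big_copy AS EXECUTE q;     -- ...but EXECUTE can drive a write, which parallel query can't handle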

Álvaro Herrera pushed:

Andrew Dunstan pushed:

Rejected patches (to date)

No one was disappointed this week :-)

Pending patches

Amit Kapila sent in another revision of a patch to extend pg_stat_activity with wait_event_type information.

Corey Huinker sent in another revision of a patch to add \gexec to psql.
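
In the form \gexec eventually took in PostgreSQL 9.6, each cell of a query's result is re-executed by psql as a statement of its own, for example:

    SELECT format('VACUUM ANALYZE %I.%I', schemaname, tablename)
    FROM pg_tables
    WHERE schemaname = 'public' \gexec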

Rushabh Lathia sent in another revision of a patch to help fix some slowness in FDW DML.

Michaël Paquier sent in another revision of a patch to add new authentication methods with SCRAM as one example.

Vitaly Burovoy sent in a patch to fix handling of negative years.

Ashutosh Bapat sent in another revision of a patch to push sorted joins down to FDWs.

Fujii Masao sent in a patch to add tab completion in psql to CREATE USER MAPPING.

Michaël Paquier sent in a pair of patches, one which extends XLogInsert() with an extra argument for flags, the other which introduces XLogInsertExtended with this extra argument and leaves XLogInsert() alone.

Michaël Paquier sent in another revision of a patch to fix a potential data loss bug on ext4 filesystems.

Craig Ringer sent in another revision of a patch to implement failover slots.

Tomas Vondra and Mark Dilger traded patches to improve GROUP BY estimation.

Tomas Vondra and Kyotaro HORIGUCHI traded patches to allow index-only scans with partial indexes.

Jim Nasby sent in another revision of a patch to convert PL/Tcl from strings to objects.

Thomas Munro sent in two more revisions of a patch to make PostgreSQL in parallel mode safer on PPC.

Teodor Sigaev sent in two more revisions of a patch to fix a GIN corruption bug.

Vitaly Burovoy sent in a patch to fix some overflows in timestamp[tz].

Iacob Catalin and Pavel Stěhule traded patches to add an ereport function to PL/PythonU.

Mithun Cy sent in another revision of a patch to cache data in GetSnapshotData().

Kyotaro HORIGUCHI and SAWADA Masahiko traded patches to support N>1 synchronous standby servers.

Petr Jelínek sent in another revision of a patch to add generic WAL messages.

Julien Rouhaud sent in a patch to ensure that the CREATE OPERATOR CLASS documentation mentions that BRIN indexes also support the STORAGE parameter.

Kyotaro HORIGUCHI sent in a patch to fix wrong comments for PQmblen() and PQdsplen().

Kyotaro HORIGUCHI sent in a patch to fix identifier completion with multibyte characters.

Peter Eisentraut sent in a patch to add table qualifications to some tags in pg_dump.

Kyotaro HORIGUCHI sent in another revision of a patch to add "IF [NOT] EXISTS" support to psql's tab completion.

Michaël Paquier sent in four more revisions of a patch to add new regression tests for recovery, etc.

Peter Eisentraut sent in a patch to introduce new configuration parameters syslog_sequence_numbers and syslog_split_lines.

Amit Kapila sent in another revision of a patch to speed up CLOG access.

Vitaly Burovoy sent in a patch to allow infinite values in to_timestamp.

Pavel Stěhule sent in another revision of a patch to add a raw format to COPY.

Vinayak Pokure sent in another revision of a patch to add a vacuum progress checker.

Roma Sokolov sent in two revisions of a patch to fix DROP OPERATOR to reset links to itself on commutator and negator.

Ivan Kartyshov sent in two revisions of a patch to add a pg_oldest_xlog_location() function.

Simon Riggs sent in two revisions of a patch to fix an issue with relcache invalidation on a physical replica.

Amit Langote sent in a patch to fix a typo in src/backend/utils/init/postinit.c.

Joe Conway sent in two more revisions of a patch to add control data functions.

Tom Lane sent in a WIP patch to path-ify the upper planner. And there was much rejoicing.

Jim Nasby sent in a patch to improve error handling in PL/Tcl.

by N Bougain on Tuesday 1 March 2016 at 00:34

Monday 22 February 2016

PostgreSQL.fr news

PostgreSQL Weekly News - 21 February 2016

Registration for PGDay Paris 2016, scheduled for 31 March, is open: http://www.pgday.paris/registration/

Registration and sponsorship details for pgDay Asia have been updated: http://2016.pgday.asia/sponsorship.html http://2016.pgday.asia/index.html#registration

[Translator's note: Meetup in Toulouse this Tuesday the 23rd at noon: http://www.meetup.com/fr-FR/PostgreSQL-User-Group-Toulouse/events/228604600/]

News from derived products

PostgreSQL-related job offers for February

PostgreSQL Local

  • The annual Indian PGday will be held in Bangalore (Karnataka, India) on 26 February 2016: http://pgday.in
  • The first pan-Asian PostgreSQL conference will be held on 17 and 19 March 2016 in Singapore. Registration is open: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • The 8th PostgreSQL Session will take place on 6 April 2016 in Lyon (France). The call for papers runs until 29 February at call-for-paper AT postgresql-sessions DOT org.
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York City. Registration is open: http://www.pgconf.us/2016/
  • LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for speakers is now open: http://www.linuxfestnorthwest.org/2016/present
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. Speaker proposals are still being accepted: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will be held from 17 to 21 May 2016 in Ottawa: http://www.pgcon.org/
  • The Swiss PGDay will this year be held at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for speakers is open: http://www.pgday.ch/
  • "5432 ... Meet us!" will take place in Milan (Italy) on 28 & 29 June 2016. The call for papers runs until 28 February: http://5432meet.us/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under the CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20160222002955.GC10490@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org, and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Noah Misch pushed:

Magnus Hagander pushed:

Fujii Masao pushed:

  • Make concurrent refresh check early that there is a unique index on matview. In REFRESH MATERIALIZED VIEW command, CONCURRENTLY option is only allowed if there is at least one unique index with no WHERE clause on one or more columns of the matview. Previously, concurrent refresh checked the existence of a unique index on the matview after filling the data to new snapshot, i.e., after calling refresh_matview_datafill(). So, when there was no unique index, we could need to wait a long time before we detected that and got the error. It was a waste of time. To eliminate such wasting time, this commit changes concurrent refresh so that it checks the existence of a unique index at the beginning of the refresh operation, i.e., before starting any time-consuming jobs. If CONCURRENTLY option is not allowed due to lack of a unique index, concurrent refresh can immediately detect it and emit an error. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Fujii Masao http://git.postgresql.org/pg/commitdiff/31b6606c48edf7c008ffe91907c080404a8c8046
  • Correct the formulas for System V IPC parameters SEMMNI and SEMMNS in docs. In runtime.sgml, the old formulas for calculating the reasonable values of SEMMNI and SEMMNS were incorrect. They have forgotten to count the number of semaphores which both the checkpointer process (introduced in 9.2) and the background worker processes (introduced in 9.3) need. This commit fixes those formulas so that they count the number of semaphores which the checkpointer process and the background worker processes need. Report and patch by Kyotaro Horiguchi. Only the patch for 9.3 was modified by me. Back-patch to 9.2 where the checkpointer process was added and the number of needed semaphores was increased. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Backpatch: 9.2 Discussion: http://www.postgresql.org/message-id/20160203.125119.66820697.horiguchi.kyotaro@lab.ntt.co.jp http://git.postgresql.org/pg/commitdiff/597f7e3a6ec393cf9ff3e11552faf69ff0ab652b

Joe Conway pushed:

Andres Freund pushed:

  • Allow SetHintBits() to succeed if the buffer's LSN is new enough. Previously we only allowed SetHintBits() to succeed if the commit LSN of the last transaction touching the page has already been flushed to disk. We can't generally change the LSN of the page, because we don't necessarily have the required locks on the page. But the required LSN interlock does not mean the commit record has to be flushed immediately, it just requires that the commit record will be flushed before the page is written out. Therefore if the buffer LSN is newer than the commit LSN, the hint bit can be safely set. In a number of scenarios (e.g. pgbench) this noticeably increases the number of hint bits that are set. But more importantly it also keeps the success rate up when flushing WAL less frequently. That was the original reason for commit 4de82f7d7, which has negative performance consequences in a number of scenarios. This will allow a followup commit to reduce the flush rate. Discussion: 20160118163908.GW10941@awork2.anarazel.de http://git.postgresql.org/pg/commitdiff/db76b1efbbab2441428a9ef21f7ac9ba43c52482
  • Allow the WAL writer to flush WAL at a reduced rate. Commit 4de82f7d7 increased the WAL flush rate, mainly to increase the likelihood that hint bits can be set quickly. More quickly set hint bits can reduce contention around the clog et al. But unfortunately the increased flush rate can have a significant negative performance impact, I have measured up to a factor of ~4. The reason for this slowdown is that if there are independent writes to the underlying devices, for example because shared buffers is a lot smaller than the hot data set, or because a checkpoint is ongoing, the fdatasync() calls force cache flushes to be emitted to the storage. This is achieved by flushing WAL only if the last flush was longer than wal_writer_delay ago, or if more than wal_writer_flush_after (new GUC) unflushed blocks are pending. Based on some tests the default for wal_writer_flush_after is 1MB, which seems to work well both on SSD and rotational media. To avoid negative performance impact due to 4de82f7d7 an earlier commit (db76b1e) made SetHintBits() more likely to succeed; preventing performance regressions in the pgbench tests I performed. Discussion: 20160118163908.GW10941@awork2.anarazel.de http://git.postgresql.org/pg/commitdiff/7975c5e0a992ae9a45e03d145e0d37e2b5a707f5
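
Since both knobs are ordinary GUCs, the flush behaviour can be tuned from SQL; a small sketch (the values shown match the defaults):

    ALTER SYSTEM SET wal_writer_flush_after = '1MB';   -- flush once this much unflushed WAL has accumulated
    ALTER SYSTEM SET wal_writer_delay = '200ms';        -- ...or when the last flush is older than this
    SELECT pg_reload_conf();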

Tom Lane pushed:

  • Suppress compiler warnings about useless comparison of unsigned to zero. Reportedly, some compilers warn about tests like "c < 0" if c is unsigned, and hence complain about the character range checks I added in commit 3bb3f42f3749d40b8d4de65871e8d828b18d4a45. This is a bit of a pain since the regex library doesn't really want to assume that chr is unsigned. However, since any such reconfiguration would involve manual edits of regcustom.h anyway, we can put it on the shoulders of whoever wants to do that to adjust this new range-checking macro correctly. Per gripes from Coverity and Andres. http://git.postgresql.org/pg/commitdiff/8c95ae81fab11b75a611b57d6aaa0ef77e8b8e41
  • Improve documentation about CREATE INDEX CONCURRENTLY. Clarify the description of which transactions will block a CREATE INDEX CONCURRENTLY command from proceeding, and mention that the index might still not be usable after CREATE INDEX completes. (This happens if the index build detected broken HOT chains, so that pg_index.indcheckxmin gets set, and there are open old transactions preventing the xmin horizon from advancing past the index's initial creation. I didn't want to explain what broken HOT chains are, though, so I omitted an explanation of exactly when old transactions prevent the index from being used.) Per discussion with Chris Travers. Back-patch to all supported branches, since the same text appears in all of them. http://git.postgresql.org/pg/commitdiff/a65313f28bfc264573a066271a11172d109dc2c4
  • Make plpython cope with funny characters in function names. A function name that's double-quoted in SQL can contain almost any characters, but we were using that name directly as part of the name generated for the Python-level function, and Python doesn't like anything that isn't pretty much a standard identifier. To fix, replace anything that isn't an ASCII letter or digit with an underscore in the generated name. This doesn't create any risk of duplicate Python function names because we were already appending the function OID to the generated name to ensure uniqueness. Per bug #13960 from Jim Nasby. Patch by Jim Nasby, modified a bit by me. Back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/66f503868b2ac1163aaf48a2f76d8be02af0bc81
  • Fix multiple bugs in contrib/pgstattuple's pgstatindex() function. Dead or half-dead index leaf pages were incorrectly reported as live, as a consequence of a code rearrangement I made (during a moment of severe brain fade, evidently) in commit d287818eb514d431. The index metapage was not counted in index_size, causing that result to not agree with the actual index size on-disk. Index root pages were not counted in internal_pages, which is inconsistent compared to the case of a root that's also a leaf (one-page index), where the root would be counted in leaf_pages. Aside from that inconsistency, this could lead to additional transient discrepancies between the reported page counts and index_size, since it's possible for pgstatindex's scan to see zero or multiple pages marked as BTP_ROOT, if the root moves due to a split during the scan. With these fixes, index_size will always be exactly one page more than the sum of the displayed page counts. Also, the index_size result was incorrectly documented as being measured in pages; it's always been measured in bytes. (While fixing that, I couldn't resist doing some small additional wordsmithing on the pgstattuple docs.) Including the metapage causes the reported index_size to not be zero for an empty index. To preserve the desired property that the pgstattuple regression test results are platform-independent (ie, BLCKSZ configuration independent), scale the index_size result in the regression tests. The documentation issue was reported by Otsuka Kenji, and the inconsistent root page counting by Peter Geoghegan; the other problems noted by me. Back-patch to all supported branches, because this has been broken for a long time. http://git.postgresql.org/pg/commitdiff/48e6c943e5f11f5d80cabdbcd98f79e3dbad1988
  • Add an explicit representation of the output targetlist to Paths. Up to now, there's been an assumption that all Paths for a given relation compute the same output column set (targetlist). However, there are good reasons to remove that assumption. For example, an indexscan on an expression index might be able to return the value of an expensive function "for free". While we have the ability to generate such a plan today in simple cases, we don't have a way to model that it's cheaper than a plan that computes the function from scratch, nor a way to create such a plan in join cases (where the function computation would normally happen at the topmost join node). Also, we need this so that we can have Paths representing post-scan/join steps, where the targetlist may well change from one step to the next. Therefore, invent a "struct PathTarget" representing the columns we expect a plan step to emit. It's convenient to include the output tuple width and tlist evaluation cost in this struct, and there will likely be additional fields in future. While Path nodes that actually do have custom outputs will need their own PathTargets, it will still be true that most Paths for a given relation will compute the same tlist. To reduce the overhead added by this patch, keep a "default PathTarget" in RelOptInfo, and allow Paths that compute that column set to just point to their parent RelOptInfo's reltarget. (In the patch as committed, actually every Path is like that, since we do not yet have any cases of custom PathTargets.) I took this opportunity to provide some more-honest costing of PlaceHolderVar evaluation. Up to now, the assumption that "scan/join reltargetlists have cost zero" was applied not only to Vars, where it's reasonable, but also PlaceHolderVars where it isn't. Now, we add the eval cost of a PlaceHolderVar's expression to the first plan level where it can be computed, by including it in the PathTarget cost field and adding that to the cost estimates for Paths. This isn't perfect yet but it's much better than before, and there is a way forward to improve it more. This costing change affects the join order chosen for a couple of the regression tests, changing expected row ordering. http://git.postgresql.org/pg/commitdiff/19a541143a09c067ec8cac77ec6a64eb5b1b662b
  • Cosmetic improvements in new config_info code. Coverity griped about use of unchecked strcpy() into a local variable. There's unlikely to be any actual bug there, since no caller would be passing a path longer than MAXPGPATH, but nonetheless use of strlcpy() seems preferable. While at it, get rid of unmaintainable separation between list of field names and list of field values in favor of initializing them in parallel. And we might as well declare get_configdata()'s path argument as const char *, even though no current caller needs that. http://git.postgresql.org/pg/commitdiff/c7a1c5a6b6aa4bbc2c9619edc94368fccc1c8c8e
  • Docs: make prose discussion match the ordering of Table 9-58. The "Session Information Functions" table seems to be sorted mostly alphabetically (although it's not perfect), which would be all right if it didn't lead to some related functions being described in a pretty nonintuitive order. Also, the prose discussions after the table were in an order that hardly matched the table at all. Rearrange to make things a bit easier to follow. http://git.postgresql.org/pg/commitdiff/64a169d1313d6b99b48c2d270df121ef43c03269

Álvaro Herrera pushed:

  • pgbench: avoid FD_ISSET on an invalid file descriptor. The original code wasn't careful to test the file descriptor returned by PQsocket() for an invalid socket. If an invalid socket did turn up, that would amount to calling FD_ISSET with fd = -1, whereby undefined behavior can be invoked. To fix, test the file descriptor for validity and stop further processing if that fails. Problem noticed by Coverity. There is an existing FD_ISSET callsite that does check for invalid sockets beforehand, but the error message reported by it was strerror(errno); in testing the aforementioned change, that turns out to result in "bad socket: Success" which isn't terribly helpful. Instead use PQerrorMessage() in both places which is more likely to contain a useful error message. Backpatch-through: 9.1. http://git.postgresql.org/pg/commitdiff/5df44d14ba9fd3f6149c3fa0919745c9e24bcffe

Tatsuo Ishii pushed:

Michael Meskes pushed:

Bruce Momjian pushed:

Robert Haas pushed:

Peter Eisentraut pushed:

Simon Riggs pushed:

  • Correct StartupSUBTRANS for page wraparound. StartupSUBTRANS() incorrectly handled cases near the max pageid in the subtrans data structure, which in some cases could lead to errors in startup for Hot Standby. This patch wraps the pageids correctly, avoiding any such errors. Identified by exhaustive crash testing by Jeff Janes. Jeff Janes http://git.postgresql.org/pg/commitdiff/481725c0ba731b77fb32cadb12013373e378011a

Dean Rasheed pushed:

Andrew Dunstan pushed:

  • Fix two-argument jsonb_object when called with empty arrays. Some over-eager copy-and-pasting on my part resulted in a nonsense result being returned in this case. I have adopted the same pattern for handling this case as is used in the one argument form of the function, i.e. we just skip over the code that adds values to the object. Diagnosis and patch from Michael Paquier, although not quite his solution. Fixes bug #13936. Backpatch to 9.5 where jsonb_object was introduced. http://git.postgresql.org/pg/commitdiff/94c745eb189e2122a3ff86c24443b11408ea2376
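
For illustration, the affected two-argument call, plus a normal one for comparison (results shown are post-fix):

    SELECT jsonb_object('{}'::text[], '{}'::text[]);
    -- => {}
    SELECT jsonb_object('{a, b}'::text[], '{1, 2}'::text[]);
    -- => {"a": "1", "b": "2"}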

Rejected patches (to date)

No one was disappointed this week :-)

Pending patches

Craig Ringer sent in two more revisions of a patch to implement failover slots.

Amit Langote sent in two more revisions of a patch to implement declarative partitioning.

Etsuro Fujita and Ashutosh Bapat traded patches to fix some breakage in foreign_join_ok.

Teodor Sigaev sent in another revision of a patch to add support for box type in SP-GiST index.

Dmitry Ivanov sent in a patch to implement ALTER ... OWNER TO ... CASCADE.

Eugene Kazakov sent in a patch to add an m4 check for the TAP perl modules.

Martin Liška sent in a patch to clean up for -sanitize=use-after-scope.

Fabien COELHO and Robert Haas traded patches to extend pgbench with expressions.

SAWADA Masahiko sent in another revision of a patch to allow N>1 synchronous standby servers.

Alexander Korotkov sent in three more revisions of a patch to implement access method extensibility.

Robbie Harwood sent in another revision of a patch to implement GSSAPI encryption support.

Filip Rembiałkowski sent in two more revisions of a patch to make NOTIFY list de-duplication optional.

Tom Lane sent in two revisions of a patch to add a new function that reports the set of PIDs directly blocking a given PID.

Ashutosh Bapat sent in another revision of a patch to fix the docs for GetExistingLocalJoinPath().

Sehrope Sarkuni sent in a patch to add tab completion for CREATE DATABASE ... TEMPLATE ... to psql.

Julien Rouhaud sent in two more revisions of a patch to add an auto_explain sample rate.

Haribabu Kommi sent in another revision of a patch to implement aggregation in parallel.

Kyotaro HORIGUCHI sent in a patch to add an additional member in the struct ErrorData to hold a message id.

Artur Zakirov sent in two more revisions of a patch to improve Hunspell dictionary support.

SAWADA Masahiko sent in three more revisions of a patch to add a "frozen" bit to the visibility map.

Jim Nasby and Pavel Stěhule traded patches to add a parse_ident() function.

Pavel Stěhule sent in three more revisions of a patch to add an ereport function to PL/PythonU.

Ashutosh Bapat sent in a patch to allow pushing down sorted joins to the PostgreSQL FDW.

Michaël Paquier sent in another revision of a patch to add in-core regression tests.

Amit Kapila sent in a patch to add prepared statement support for parallel query.

Artur Zakirov sent in another revision of a patch to add fuzzy substring searching to the pg_trgm extension.

Stas Kelvich sent in another revision of a patch to add tsvector editing functions.

Constantin S. Pan sent in two more revisions of a patch to speed up building GIN indexes with parallel workers.

Etsuro Fujita sent in another revision of a patch to speed up writes to foreign tables.

Kyotaro HORIGUCHI sent in a patch to make the SQL parser part of psqlscan independent of psql, get pgbench to use same, and change the way to hold a command list from an array to a linked list.

Suraj Kharage sent in a patch to add regression tests for multisync (>1 synchronous replica) replication.

Dmitry Dolgov sent in a patch to enable inserting a new value into an array at arbitrary position in jsonb.

Anastasia Lubennikova sent in two more revisions of a patch to store duplicates more efficiently in a B-tree index.

Takashi Horikawa sent in two revisions of a patch to fix a typo in bufmgr.c that results in waste of memory.

Christoph Berg sent in three revisions of a patch to relax the permission checks on SSL keys.

Kyotaro HORIGUCHI sent in a patch to add a multi-socket version of WaitLatchOrSocket.

Corey Huinker sent in a patch to add \gexec to psql.

Peter Eisentraut sent in a patch to fix some warnings generated by GCC 6.

Andres Freund sent in another revision of a patch to allow to trigger kernel writeback after a configurable number of writes, and atop this, allow checkpoint sorting and balancing.

Amit Kapila sent in another revision of a patch to speed up CLOG access by increasing the number of CLOG buffers.

Pavel Stěhule sent in another revision of a patch to allow PL/pgsql to use %ARRAYTYPE and %ELEMENTTYPE in declarations.

Michaël Paquier sent in another revision of a patch to add hot standby checkpoints.

Tom Lane sent in a patch to get rid of lockGroupLeaderIdentifier.

by N Bougain on Monday 22 February 2016 at 02:43

Wednesday 17 February 2016

PostgreSQL.fr news

PostgreSQL 9.5.1, 9.4.6, 9.3.11, 9.2.15 and 9.1.20 released

The PostgreSQL Global Development Group has published an update for all supported versions of the database system, including versions 9.5.1, 9.4.6, 9.3.11, 9.2.15 and 9.1.20. These minor releases fix two security issues as well as a number of problems discovered over the last four months. Users affected by the security issues should update their installations immediately; all others should plan to update as soon as possible.

Security fixes for regular expressions and PL/Java

This release closes security vulnerability CVE-2016-0773, an issue with regular expression (regex) parsing. The previous code allowed users to pass expression values outside the range of Unicode characters, triggering a backend crash. This issue is critical for PostgreSQL systems with untrusted users or systems that generate regexes based on user input.

The update also fixes CVE-2016-0766, a privilege escalation issue for PL/Java users. Certain configuration settings (GUCs) for PL/Java are now modifiable only by the database superuser.

Other fixes and improvements

In addition to the fixes above, this release includes many fixes for issues reported by users over the past few months. These notably include numerous fixes for features introduced in version 9.5.0, as well as a rewrite of parts of pg_dump to eliminate chronic problems with dumping extensions. Among them:

  • Fixed many issues in pg_dump with specific object types
  • Prevented over-eager pushdown of HAVING clauses for GROUPING SETS
  • Fixed deparsing (string representation) errors with ON CONFLICT ... WHERE clauses
  • Fixed tableoid errors for postgres_fdw
  • Prevented floating-point exceptions in pgbench
  • Made \det always look up foreign table names
  • Fixed quoting of domain constraint names in pg_dump
  • Prevented putting expanded objects into Const nodes
  • Allowed PL/Java to be built on Windows
  • Fixed "unresolved symbol" errors when running PL/Python code
  • Allowed Python 2 and Python 3 to be used in the same database
  • Added Python 3.5 support to PL/Python
  • Fixed an issue with subdirectory creation during initdb
  • Made pg_ctl report status correctly on Windows
  • Removed a confusing error message when pg_receivexlog is used with older servers
  • Many documentation fixes and additions
  • Fixed incorrect hash calculations in gin_extract_jsonb_path()

This release also contains tzdata release 2016a, with timezone updates for the Cayman Islands, Metlakatla, Trans-Baikal Territory (Zabaykalsky Krai), and Pakistan.

How to update?

Users of version 9.4 will need to reindex any jsonb_path_ops indexes they have created, to fix a persistent issue with missing entries in those indexes.

As with other minor releases, users do not need to dump and restore their databases or use pg_upgrade to apply this update. Simply shut down PostgreSQL, update the binaries and restart it. Users who have skipped several minor releases may need to perform additional steps after updating; see the release notes for details.

Links:

by daamien on Wednesday 17 February 2016 at 13:36

Monday 15 February 2016

PostgreSQL.fr news

PostgreSQL Weekly News - 14 February 2016

Security updates 9.5.1, 9.4.6, 9.3.11, 9.2.15 and 9.1.20 have been released. Update as soon as possible! http://www.postgresql.org/about/news/1644/

News from derived products

PostgreSQL-related job offers for February

PostgreSQL Local

  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference held on 17 and 18 February 2016 in Prague (Czech Republic). Czech-language site: http://www.p2d2.cz/
  • The annual Indian PGday will be held in Bangalore (Karnataka, India) on 26 February 2016: http://pgday.in
  • The first pan-Asian PostgreSQL conference will be held on 16 and 17 March 2016 in Singapore: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • The 8th PostgreSQL Session will take place on 6 April 2016 in Lyon (France). The call for papers runs until 29 February at call-for-paper AT postgresql-sessions DOT org.
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York: http://www.pgconf.us/2016/
  • LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for speakers is now open: http://www.linuxfestnorthwest.org/2016/present
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. Speaker proposals are still being accepted: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will be held from 17 to 21 May 2016 in Ottawa: http://www.pgcon.org/
  • The Swiss PGDay will this year be held at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for speakers is open: http://www.pgday.ch/
  • "5432 ... Meet us!" will take place in Milan (Italy) on 28 & 29 June 2016. The call for papers runs until 28 February: http://5432meet.us/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under the CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20160215010107.GC4274@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org, and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Heikki Linnakangas pushed:

Magnus Hagander pushed:

Michael Meskes pushed:

Fujii Masao pushed:

  • Various fixes to "ALTER ... SET/RESET" tab completions. Add: ALTER SYSTEM SET/RESET ... -> GUC variables, ALTER TABLE ... SET WITH -> OIDS, ALTER DATABASE/FUNCTION/ROLE/USER ... SET/RESET -> GUC variables, ALTER DATABASE/FUNCTION/ROLE/USER ... SET ... -> FROM CURRENT/TO, ALTER DATABASE/FUNCTION/ROLE/USER ... SET ... TO/= -> possible values, Author: Fujii Masao. Reviewed-by: Michael Paquier, Masahiko Sawada http://git.postgresql.org/pg/commitdiff/89611c4dfa67630f7dcc25881c17cbd1b2e24ea1
  • Make GIN regression test stable. Commit 7f46eaf added the regression test which checks that gin_clean_pending_list() cleans up the GIN pending list and returns >0. This usually works fine. But if autovacuum comes along and cleans the list before gin_clean_pending_list() starts, the function may return 0, and then the regression test may fail. To fix the problem, this commit disables autovacuum on the target index of gin_clean_pending_list() by setting autovacuum_enabled reloption to off when creating the table. Also this commit sets gin_pending_list_limit reloption to 4MB on the target index. Otherwise when running "make installcheck" with small gin_pending_list_limit GUC, insertions of data may trigger the cleanup of pending list before gin_clean_pending_list() starts and the function may return 0. This could cause the regression test to fail. Per buildfarm member spoonbill. Reported-By: Tom Lane http://git.postgresql.org/pg/commitdiff/f8a1c1d5a30003c9c24b00870d5a0f02f1c81e65
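
A sketch of the kind of setup the test now relies on (object names invented here; the reloption value is in kilobytes):

    CREATE TABLE g (tsv tsvector) WITH (autovacuum_enabled = off);
    CREATE INDEX g_gin ON g USING gin (tsv) WITH (gin_pending_list_limit = 4096);  -- 4MB
    -- ... load some rows into g ...
    SELECT gin_clean_pending_list('g_gin');  -- returns the number of pending-list pages removed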

Robert Haas pushed:

  • pgbench: Install guards against obscure overflow conditions. Dividing INT_MIN by -1 or taking INT_MIN modulo -1 can sometimes cause floating-point exceptions or otherwise misbehave. Fabien Coelho and Michael Paquier http://git.postgresql.org/pg/commitdiff/64f5edca2401f6c2f23564da9dd52e92d08b3a20
  • Make all built-in lwlock tranche IDs fixed. This makes the values more stable, which seems like a good thing for anybody who needs to look at them. Alexander Korotkov and Amit Kapila http://git.postgresql.org/pg/commitdiff/7191ce8bea0cb110a28faef178efa92bf456e030
  • postgres_fdw: Allow fetch_size to be set per-table or per-server. The default fetch size of 100 rows might not be right in every environment, so allow users to configure it. Corey Huinker, reviewed by Kyotaro Horiguchi, Andres Freund, and me. http://git.postgresql.org/pg/commitdiff/dc203dc3ac40a4b02b92fb827848a547d2957153 (See the sketch after this list.)
  • Remove CustomPath's TextOutCustomPath method. You can't really do anything useful with this in the form it currently exists; among other problems, there's no way to reread whatever information might be produced when the path is output. Work is underway to replace this with a more useful and more general system of extensible nodes, but let's start by getting rid of this bit. Extracted from a larger patch by KaiGai Kohei. http://git.postgresql.org/pg/commitdiff/f2305d40ec20e63f781983d103d819ad2b6c0faf
  • Code review for commit dc203dc3ac40a4b02b92fb827848a547d2957153. Remove duplicate assignment. This part by Ashutosh Bapat. Remove now-obsolete comment. This part by me, although the pending join pushdown patch does something similar, and for the same reason: there's no reason to keep two lists of the things in the fdw_private structure that have to be kept in sync with each other. http://git.postgresql.org/pg/commitdiff/52b63649fc5ff5d86227b8905e1c79cd9ceddf4c
  • Allow parallel custom and foreign scans. This patch doesn't put the new infrastructure to use anywhere, and indeed it's not clear how it could ever be used for something like postgres_fdw which has to send an SQL query and wait for a reply, but there might be FDWs or custom scan providers that are CPU-bound, so let's give them a way to join club parallel. KaiGai Kohei, reviewed by me. http://git.postgresql.org/pg/commitdiff/69d34408e5e7adcef8ef2f4e9c4f2919637e9a06
  • Extend sortsupport for text to more opclasses. Have varlena.c expose an interface that allows the char(n), bytea, and bpchar types to piggyback on a now-generalized SortSupport for text. This pushes a little more knowledge of the bpchar/char(n) type into varlena.c than might be preferred, but that seems like the approach that creates least friction. Also speed things up for index builds that use text_pattern_ops or varchar_pattern_ops. This patch does quite a bit of renaming, but it seems likely to be worth it, so as to avoid future confusion about the fact that this code is now more generally used than the old names might have suggested. Peter Geoghegan, reviewed by Álvaro Herrera and Andreas Karlsson, with small tweaks by me. http://git.postgresql.org/pg/commitdiff/b47b4dbf683f13e6ef09fa0d93aa6e84f3d00819
  • Change the way that LWLocks for extensions are allocated. The previous RequestAddinLWLocks() method had several disadvantages. First, the locks would be in the main tranche; we've recently decided that it's useful for LWLocks used for separate purposes to have separate tranche IDs. Second, there wasn't any correlation between what code called RequestAddinLWLocks() and what code called LWLockAssign(); when multiple modules are in use, it could become quite difficult to troubleshoot problems where LWLockAssign() ran out of locks. To fix, create a concept of named LWLock tranches which can be used either by extension or by core code. Amit Kapila and Robert Haas http://git.postgresql.org/pg/commitdiff/c1772ad9225641c921545b35c84ee478c326b95e
  • Add some additional core functions to support join pushdown for FDWs. GetExistingLocalJoinPath() is useful for handling EvalPlanQual rechecks properly, and GetUserMappingById() is needed to make sure you're using the right credentials. Shigeru Hanada, Etsuro Fujita, Ashutosh Bapat, Robert Haas http://git.postgresql.org/pg/commitdiff/a104a017fc5f67ff5d9c374cd831ac3948a874c2
  • When modifying a foreign table, initialize tableoid field properly. Failure to do this can cause AFTER ROW triggers or RETURNING expressions that reference this field to misbehave. Etsuro Fujita, reviewed by Thom Brown http://git.postgresql.org/pg/commitdiff/9418d79a7664e75a2824adfc78b859b4d0f77962
  • postgres_fdw: Avoid possible misbehavior when RETURNING tableoid column only. deparseReturningList ended up adding RETURNING NULL to the code, but code elsewhere saw an empty list of attributes and concluded that it should not expect tuples from the remote side. Etsuro Fujita and Robert Haas, reviewed by Thom Brown http://git.postgresql.org/pg/commitdiff/37c84570b1e32aef886c9b546e0dd4a128cb7492
  • postgres_fdw: pgindent run. In preparation for upcoming commits. http://git.postgresql.org/pg/commitdiff/d0cd7bda97a626049aa7d247374909c52399c413
  • Fix typo. Amit Kapila http://git.postgresql.org/pg/commitdiff/78bea62ab0b16a0c7aaa1e460064c32f9f35041d
  • Fix small goof in comment. Peter Geoghegan http://git.postgresql.org/pg/commitdiff/63f39b9148319c2e399dd827941b4d579b79f18b
  • Remove parallel-safety check from GetExistingLocalJoinPath. Commit a104a017fc5f67ff5d9c374cd831ac3948a874c2 has this check because I added it to the submitted patch before commit, but that was entirely wrongheaded, as explained to me by Ashutosh Bapat, who also wrote this patch. http://git.postgresql.org/pg/commitdiff/e0e7b8fa22539a81cc390f8ec57d6b52391b1335
  • Fix typo in comment. Michael Paquier http://git.postgresql.org/pg/commitdiff/e98fd7860773698eaaf6332decc364bb31bca677
  • Introduce group locking to prevent parallel processes from deadlocking. For locking purposes, we now regard heavyweight locks as mutually non-conflicting between cooperating parallel processes. There are some possible pitfalls to this approach that are not to be taken lightly, but it works OK for now and can be changed later if we find a better approach. Without this, it's very easy for parallel queries to silently self-deadlock if the user backend holds strong relation locks. Robert Haas, with help from Amit Kapila. Thanks to Noah Misch and Andres Freund for extensive discussion of possible issues with this approach. http://git.postgresql.org/pg/commitdiff/a1c1af2a1f6099c039f145c1edb52257f315be51
  • Introduce a new GUC force_parallel_mode for testing purposes. When force_parallel_mode = true, we enable the parallel mode restrictions for all queries for which this is believed to be safe. For the subset of those queries believed to be safe to run entirely within a worker, we spin up a worker and run the query there instead of running it in the original process. When force_parallel_mode = regress, make additional changes to allow the regression tests to run cleanly even though parallel workers have been injected under the hood. Taken together, this facilitates both better user testing and better regression testing of the parallelism code. Robert Haas, with help from Amit Kapila and Rushabh Lathia. http://git.postgresql.org/pg/commitdiff/7c944bd903392829608a9fba5b0e68c4fe89abf8 (a brief usage sketch appears after this list)
  • Fix parallel-safety markings for pg_upgrade functions. These establish backend-local state which will not be copied to parallel workers, so they must be marked parallel-restricted, not parallel-safe. http://git.postgresql.org/pg/commitdiff/d89f06f0482458d4b76e3be67ea428fec2a0aeb6
  • postgres_fdw: Push down joins to remote servers. If we've got a relatively straightforward join between two tables, this pushes that join down to the remote server instead of fetching the rows for each table and performing the join locally. Some cases are not handled yet, such as SEMI and ANTI joins. Also, we don't yet attempt to create presorted join paths or parameterized join paths even though these options do get tried for a base relation scan. Nevertheless, this seems likely to be a very significant win in many practical cases. Shigeru Hanada and Ashutosh Bapat, reviewed by Robert Haas, with additional review at various points by Tom Lane, Etsuro Fujita, KaiGai Kohei, and Jeevan Chalke. http://git.postgresql.org/pg/commitdiff/e4106b2528727c4b48639c0e12bf2f70a766b910
  • postgres_fdw: Remove unstable regression test. Per Tom Lane and the buildfarm. http://git.postgresql.org/pg/commitdiff/bb4df42e6a394ce77801b6952b6dc8b43d91fea7
  • postgres_fdw: Remove unnecessary variable. It causes warnings in non-Assert-enabled builds. Per report from Jeff Janes. http://git.postgresql.org/pg/commitdiff/019e78813760e664a85f505b5953d362a2b468cc
  • Code cleanup in the wake of recent LWLock refactoring. As of commit c1772ad9225641c921545b35c84ee478c326b95e, there's no longer any way of requesting additional LWLocks in the main tranche, so we don't need NumLWLocks() or LWLockAssign() any more. Also, some of the allocation counters that we had previously aren't needed any more either. Amit Kapila http://git.postgresql.org/pg/commitdiff/79a7ff0fe56ac9d782b0734ebb0e7a5299015e58
  • Specify permutations for isolation tests with "invalid" permutations. This is a necessary prerequisite for forthcoming changes to allow deadlock scenarios to be tested by the isolation tester. It is also a good idea on general principle, since these scenarios add no useful test coverage not provided by other scenarios, but do take time to execute. http://git.postgresql.org/pg/commitdiff/c9882c60f44cf5d0b37411535175a5c154fdad0e
  • Modify the isolation tester so that multiple sessions can wait. This allows testing of deadlock scenarios. Scenarios that would previously have been considered invalid are now simply taken as a scenario in which more than one backend will wait. http://git.postgresql.org/pg/commitdiff/38f8bdcac4982215beb9f65a19debecaf22fd470
  • Rename PGPROC fields related to group XID clearing again. Commit 0e141c0fbb211bdd23783afa731e3eef95c9ad7a introduced a new facility to reduce ProcArrayLock contention by clearing several XIDs from the ProcArray under a single lock acquisition. The names initially chosen were deemed not to be very good choices, so commit 4aec49899e5782247e134f94ce1c6ee926f88e1c renamed them. But now it seems like we still didn't get it right. A pending patch wants to add similar infrastructure for batching CLOG updates, so the names need to be clear enough to allow a new set of structure members with a related purpose. Amit Kapila http://git.postgresql.org/pg/commitdiff/a455878d99561d4b199915ed7a7fb02f5e621710
  • Add some isolation tests for deadlock detection and resolution. Previously, we had no test coverage for the deadlock detector. http://git.postgresql.org/pg/commitdiff/4c9864b9b4d87d02f07f40bb27976da737afdcab
  • Use separate lwlock tranches for buffer, lock, and predicate lock managers. This finishes the work - spread across many commits over the last several months - of putting each type of lock other than the named individual locks into a separate tranche. Amit Kapila http://git.postgresql.org/pg/commitdiff/c319991bcad02a2e99ddac3f42762b0f6fa8d52a
  • Make builtin lwlock tranche names consistent. Previously, we had a mix of styles. Amit Kapila http://git.postgresql.org/pg/commitdiff/63461a63f94a333eae272be3d44ae1602cda75cb
  • Introduce extensible node types. An extensible node is always tagged T_Extensible, but the extnodename field identifies it more specifically; it may also include arbitrary private data. Extensible nodes can be copied, tested for equality, serialized, and deserialized, but the core system doesn't know anything about them otherwise. Some extensions may find it useful to include these nodes in fdw_private or custom_private lists in lieu of arm-wrestling their data into a format that the core code can understand. Along the way, so as not to burden the authors of such extensible node types too much, expose the functions for writing serialized tokens, and for serializing and deserializing bitmapsets. KaiGai Kohei, per a design suggested by me. Reviewed by Andres Freund and by me, and further edited by me. http://git.postgresql.org/pg/commitdiff/bcac23de73b89b001fbc628d84471a392e928d1c
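
As a rough illustration of the force_parallel_mode GUC described in the bullet above (only a sketch against a build that includes this commit; the table t and its contents are hypothetical):

-- hypothetical scratch table
CREATE TABLE t AS SELECT g AS id FROM generate_series(1, 100000) g;

-- run parallel-safe queries through a parallel worker even when the planner
-- would not normally choose a parallel plan
SET force_parallel_mode = on;

-- depending on the other parallel-query settings, the plan should now show
-- a Gather node on top; force_parallel_mode = regress additionally adjusts
-- things so that regression test output stays stable
EXPLAIN (COSTS OFF) SELECT count(*) FROM t;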

Álvaro Herrera pushed:

Teodor Sigaev pushed:

  • Fix lossy KNN GiST when the ordering operator returns a non-float8 value. KNN GiST with the recheck flag should return to the executor the same type as the ordering operator; GiST detects this type by looking at the return type of the function that implements the ordering operator. But occasionally the detecting code ran after the ordering operator function had been replaced by the distance support function. The distance support function always returns float8, so the detecting code got float8 instead of the actual return type of the ordering operator. Built-in opclasses have no ordering operator that returns a non-float8 value, so tests are impossible here, at least for now. Backpatch to 9.5, where lossy KNN was introduced. Author: Alexander Korotkov Report by: Artur Zakirov http://git.postgresql.org/pg/commitdiff/f25d07d99f4acf136baed4ef29ea97faad7337db
  • Improve error reporting in format(). Clarify the invalid-format-conversion-type error message and add a hint. Author: Jim Nasby http://git.postgresql.org/pg/commitdiff/07d25a964b2fb78169a4a34c6f6893736f69903a

Tom Lane pushed:

  • Fix pg_description entries for jsonb_to_record() and jsonb_to_recordset(). All the other jsonb function descriptions refer to the arguments as being "jsonb", but these two said "json". Make it consistent. Per bug #13905 from Petru Florin Mihancea. No catversion bump --- we can't force one in the back branches, and this isn't very critical anyway. http://git.postgresql.org/pg/commitdiff/a4627e8fd479ff74fffdd49ad07636b79751be45
  • Remove unnecessary "implementation of FOO operator" DESCR() entries. Apparently at least one committer hasn't gotten the word that these do not need to be maintained by hand, since initdb will create them automatically. Noted while fixing bug #13905. No catversion bump since the post-initdb state is exactly the same either way. I don't see a need for back-patch, either. http://git.postgresql.org/pg/commitdiff/2ad83fff221eec2cc76f8823b0043763d0dfe0c3
  • Remove printQueryOpt.quote field. This field was included in the original definition of the printQueryOpt struct in commit a45195a191eec367, but it was not used anywhere in that commit, nor since then. Spotted by Dickson S. Guedes. http://git.postgresql.org/pg/commitdiff/2808a2e0f3e7dd98f5dc3041183fd5f389e0a8e1
  • Fix IsValidJsonNumber() to notice trailing non-alphanumeric garbage. Commit e09996ff8dee3f70 was one brick shy of a load: it didn't insist that the detected JSON number be the whole of the supplied string. This allowed inputs such as "2016-01-01" to be misdetected as valid JSON numbers. Per bug #13906 from Dmitry Ryabov. In passing, be more wary of zero-length input (I'm not sure this can happen given current callers, but better safe than sorry), and do some minor cosmetic cleanup. http://git.postgresql.org/pg/commitdiff/e6ecc93a1747624c4d33fa48d8a2d77319f3400f
  • Make hstore_to_jsonb_loose match hstore_to_json_loose on what's a number. Commit e09996ff8dee3f70 removed some ad-hoc code in hstore_to_json_loose that determined whether an hstore value string looked like a number, in favor of calling the JSON parser's is-it-a-number code. However, it neglected the fact that the exact same code appeared in hstore_to_jsonb_loose. This is not a bug, exactly, because the requirements on the two functions are not the same: hstore_to_json_loose must accept only syntactically legal JSON numbers as numbers, or it will produce invalid JSON output, as per bug #12070 which spawned the prior commit. But hstore_to_jsonb_loose could accept anything that numeric_in will eat, other than Inf and NaN. Nonetheless it seems surprising and arbitrary that the two functions don't use the same rules for what is a number versus what is a string; especially since they did use the same rules before the aforesaid commit. For one thing, that means that doing hstore_to_json_loose and then casting to jsonb can produce results different from doing just hstore_to_jsonb_loose. Hence, change hstore_to_jsonb_loose's logic to match hstore_to_json_loose, ie, hstore values are treated as numbers when they match the JSON syntax for numbers. No back-patch, since this is more in the nature of a definitional change than a bug fix. http://git.postgresql.org/pg/commitdiff/41d2c081ce659f40dec3eb9efc647082aa775eb4
  • Add hstore_to_jsonb() and hstore_to_jsonb_loose() to hstore documentation. These were never documented anywhere user-visible. Tut tut. http://git.postgresql.org/pg/commitdiff/24a26c9f5448b24943df4c9bcf154bfd9f8197a6
  • In pg_dump, ensure that view triggers are processed after view rules. If a view is split into CREATE TABLE + CREATE RULE to break a circular dependency, then any triggers on the view must be dumped/reloaded after the CREATE RULE; else the backend may reject the CREATE TRIGGER because it's the wrong type of trigger for a plain table. This works all right in plain dump/restore because of pg_dump's sorting heuristic that places triggers after rules. However, when using parallel restore, the ordering must be enforced by a dependency --- and we didn't have one. Fixing this is a mere matter of adding an addObjectDependency() call, except that we need to be able to find all the triggers belonging to the view relation, and there was no easy way to do that. Add fields to pg_dump's TableInfo struct to remember where the associated TriggerInfo struct(s) are. Per bug report from Dennis Kögel. The failure can be exhibited at least as far back as 9.1, so back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/0ed707e9b7e90891d0eda91b353edf3a69c4b7c4
  • Simplify syntax diagram for REINDEX. Since there currently is only one possible parenthesized option, namely VERBOSE, it's a bit pointless to show it with "{ } [, ... ]". The curly braces are useless and therefore confusing, as seen in a recent question from Karsten Hilbert. Remove the extra decoration for the time being; we can put it back when and if REINDEX grows some more options. http://git.postgresql.org/pg/commitdiff/5ef244a28266ce8e5666b23baed33a4c238542ff
  • Add num_nulls() and num_nonnulls() to count NULL arguments. An example use-case is "CHECK(num_nonnulls(a,b,c) = 1)" to assert that exactly one of a,b,c isn't NULL. The functions are variadic, so they can also be pressed into service to count the number of null or nonnull elements in an array. Marko Tiikkaja, reviewed by Pavel Stehule http://git.postgresql.org/pg/commitdiff/6819514fca22f8554edcab6e4d0402b0221f03bb (a short usage sketch appears after this list)
  • Update time zone data files to tzdata release 2016a. DST law changes in Cayman Islands, Metlakatla, Trans-Baikal Territory (Zabaykalsky Krai). Historical corrections for Pakistan. http://git.postgresql.org/pg/commitdiff/a73311e5256b57a59677083e71b5bf93e583cc05
  • First-draft release notes for 9.4.6. As usual, the release notes for other branches will be made by cutting these down, but put them up for community review first. http://git.postgresql.org/pg/commitdiff/7008e70d105b572821406744ce080771b74c06ab
  • Add missing "static" qualifier. Per buildfarm member pademelon. http://git.postgresql.org/pg/commitdiff/392998bc58a985ea978c94c23594eb214d04c744
  • Improve HJDEBUG code a bit. Commit 30d7ae3c76d2de144232ae6ab328ca86b70e72c3 introduced an HJDEBUG stanza that probably didn't compile at the time, and definitely doesn't compile now, because it refers to a nonexistent variable. It doesn't seem terribly useful anyway, so just get rid of it. While I'm fooling with it, use %z modifier instead of the obsolete hack of casting size_t to unsigned long, and include the HashJoinTable's address in each printout so that it's possible to distinguish the activities of multiple hashjoins occurring in one query. Noted while trying to use HJDEBUG to investigate bug #13908. Back-patch to 9.5, because code that doesn't compile is certainly not very helpful. http://git.postgresql.org/pg/commitdiff/be11f8400d7d99e8ae6602f3175e04b4f0c99376
  • Fix comment block trashed by pgindent. Looks like I put the protective dashes in the wrong place in f4e4b32743. http://git.postgresql.org/pg/commitdiff/b921aeb1676f128f2c41ddc40d3887964ea9eae9
  • Improve speed of timestamp/time/date output functions. It seems that sprintf(), at least in glibc's version, is unreasonably slow compared to hand-rolled code for printing integers. Replacing most uses of sprintf() in the datetime.c output functions with special-purpose code turns out to give more than a 2X speedup in COPY of a table with a single timestamp column; which is pretty impressive considering all the other logic in that code path. David Rowley and Andres Freund, reviewed by Peter Geoghegan and myself http://git.postgresql.org/pg/commitdiff/aa2387e2fd532954e88dfd8546ab894b9305123d
  • ExecHashRemoveNextSkewBucket must physically copy tuples to main hashtable. Commit 45f6240a8fa9d355 added an assumption in ExecHashIncreaseNumBatches and ExecHashIncreaseNumBuckets that they could find all tuples in the main hash table by iterating over the "dense storage" introduced by that patch. However, ExecHashRemoveNextSkewBucket continued its old practice of simply re-linking deleted skew tuples into the main table's hashchains. Hence, such tuples got lost during any subsequent increase in nbatch or nbuckets, and would never get joined, as reported in bug #13908 from Seth P. I (tgl) think that the aforesaid commit has got multiple design issues and should be reworked rather completely; but there is no time for that right now, so band-aid the problem by making ExecHashRemoveNextSkewBucket physically copy deleted skew tuples into the "dense storage" arena. The added test case is able to exhibit the problem by means of fooling the planner with a WHERE condition that it will underestimate the selectivity of, causing the initial nbatch estimate to be too small. Tomas Vondra and Tom Lane. Thanks to David Johnston for initial investigation into the bug report. http://git.postgresql.org/pg/commitdiff/f867ce5518202a4e625dc41b7036fec47ee0e09e
  • Release notes for 9.5.1, 9.4.6, 9.3.11, 9.2.15, 9.1.20. http://git.postgresql.org/pg/commitdiff/1d76c9725087121bfa008f875450570a5c46241f
  • Fix deparsing of ON CONFLICT arbiter WHERE clauses. The parser doesn't allow qualification of column names appearing in these clauses, but ruleutils.c would sometimes qualify them, leading to dump/reload failures. Per bug #13891 from Onder Kalaci. (In passing, make stanzas in ruleutils.c that save/restore varprefix more consistent.) Peter Geoghegan http://git.postgresql.org/pg/commitdiff/cc2ca9319a5dbe89ea47d87944650e65e3bb4ce8
  • Improve documentation about PRIMARY KEY constraints. Get rid of the false implication that PRIMARY KEY is exactly equivalent to UNIQUE + NOT NULL. That was more-or-less true at one time in our implementation, but the standard doesn't say that, and we've grown various features (many of them required by spec) that treat a pkey differently from less-formal constraints. Per recent discussion on pgsql-general. I failed to resist the temptation to do some other wordsmithing in the same area. http://git.postgresql.org/pg/commitdiff/c477e84fe2471cb675234fce75cd6bb4bc2cf481
  • Use %u not %d to print OIDs. Oversight in commit 96198d94c. Etsuro Fujita http://git.postgresql.org/pg/commitdiff/63828969c822449744e63b76bff993ccd1d3245e
  • Re-pgindent varlena.c. Just to make sure previous commit worked ... http://git.postgresql.org/pg/commitdiff/0231f838565d2921a0960407c4240237ba1d56ae
  • Rename typedef "string" to "VarString". Since pgindent treats typedef names as global, the original coding of b47b4dbf683f13e6 would have had rather nasty effects on the formatting of other files in which "string" is used as a variable or field name. Use a less generic name for this typedef, and rename some other identifiers to match. Peter Geoghegan, per gripe from me http://git.postgresql.org/pg/commitdiff/58e797216ff52c0656d3c343d0732a2530cafb71
  • Temporarily make pg_ctl and server shutdown a whole lot chattier. This is a quick hack, due to be reverted when its purpose has been served, to try to gather information about why some of the buildfarm critters regularly fail with "postmaster does not shut down" complaints. Maybe they are just really overloaded, but maybe something else is going on. Hence, instrument pg_ctl to print the current time when it starts waiting for postmaster shutdown and when it gives up, and add a lot of logging of the current time in the server's checkpoint and shutdown code paths. No attempt has been made to make this pretty. I'm not even totally sure if it will build on Windows, but we'll soon find out. http://git.postgresql.org/pg/commitdiff/3971f64843b02e4a55d854156bd53e46a0588e45
  • Fix some regex issues with out-of-range characters and large char ranges. Previously, our regex code defined CHR_MAX as 0xfffffffe, which is a bad choice because it is outside the range of type "celt" (int32). Characters approaching that limit could lead to infinite loops in logic such as "for (c = a; c <= b; c++)" where c is of type celt but the range bounds are chr. Such loops will work safely only if CHR_MAX+1 is representable in celt, since c must advance to beyond b before the loop will exit. Fortunately, there seems no reason not to restrict CHR_MAX to 0x7ffffffe. It's highly unlikely that Unicode will ever assign codes that high, and none of our other backend encodings need characters beyond that either. In addition to modifying the macro, we have to explicitly enforce character range restrictions on the values of \u, \U, and \x escape sequences, else the limit is trivially bypassed. Also, the code for expanding case-independent character ranges in bracket expressions had a potential integer overflow in its calculation of the number of characters it could generate, which could lead to allocating too small a character vector and then overwriting memory. An attacker with the ability to supply arbitrary regex patterns could easily cause transient DOS via server crashes, and the possibility for privilege escalation has not been ruled out. Quite aside from the integer-overflow problem, the range expansion code was unnecessarily inefficient in that it always produced a result consisting of individual characters, abandoning the knowledge that we had a range to start with. If the input range is large, this requires excessive memory. Change it so that the original range is reported as-is, and then we add on any case-equivalent characters that are outside that range. With this approach, we can bound the number of individual characters allowed without sacrificing much. This patch allows at most 100000 individual characters, which I believe to be more than the number of case pairs existing in Unicode, so that the restriction will never be hit in practice. It's still possible for range() to take awhile given a large character code range, so also add statement-cancel detection to its loop. The downstream function dovec() also lacked cancel detection, and could take a long time given a large output from range(). Per fuzz testing by Greg Stark. Back-patch to all supported branches. Security: CVE-2016-0773 http://git.postgresql.org/pg/commitdiff/3bb3f42f3749d40b8d4de65871e8d828b18d4a45
  • Last-minute updates for release notes. Security: CVE-2016-0773 http://git.postgresql.org/pg/commitdiff/02292845ac6d6ec09d79abf1dbb0538e14582743
  • Add more chattiness in server shutdown. Early returns from the buildfarm show that there's a bit of a gap in the logging I added in 3971f64843b02e4a: the portion of CreateCheckPoint() after CheckPointGuts() can take a fair amount of time. Add a few more log messages in that section of code. This too shall be reverted later. http://git.postgresql.org/pg/commitdiff/7351e18286ec83461b386e23328d65fd4a538bba
  • Add still more chattiness in server shutdown. Further investigation says that there may be some slow operations after we've finished ShutdownXLOG(), so add some more log messages to try to isolate that. This is all temporary code too. http://git.postgresql.org/pg/commitdiff/41d505a7ffaf8c1678b931e15f74469c84fbb91e
  • Revert "Temporarily make pg_ctl and server shutdown a whole lot chattier." This reverts commit 3971f64843b02e4a55d854156bd53e46a0588e45 and a couple of followon debugging commits; I think we've learned what we can from them. http://git.postgresql.org/pg/commitdiff/c5e9b771275b93b09eec6b760677fe6c5e726ab2
  • Avoid use of sscanf() to parse ispell dictionary files. It turns out that on FreeBSD-derived platforms (including OS X), the *scanf() family of functions is pretty much brain-dead about multibyte characters. In particular it will apply isspace() to individual bytes of input even when those bytes are part of a multibyte character, thus allowing false recognition of a field-terminating space. We appear to have little alternative other than instituting a coding rule that *scanf() is not to be used if the input string might contain multibyte characters. (There was some discussion of relying on "%ls", but that probably just moves the portability problem somewhere else, and besides it doesn't fully prevent BSD *scanf() from using isspace().) This patch is a down payment on that: it gets rid of use of sscanf() to parse ispell dictionary files, which are certainly at great risk of having a problem. The code is cleaner this way anyway, though a bit longer. In passing, improve a few comments. Report and patch by Artur Zakirov, reviewed and somewhat tweaked by me. Back-patch to all supported branches. http://git.postgresql.org/pg/commitdiff/51e78ab4ff3282963f5e8ba2633040829413aefa
  • Code review for isolationtester changes. Fix a few oversights in 38f8bdcac4982215beb9f65a19debecaf22fd470: don't leak memory in run_permutation(), remember when we've issued a cancel rather than issuing another one every 10ms, fix some typos in comments. http://git.postgresql.org/pg/commitdiff/d9dc2b4149c017c0a1d2045b858e8e0cc1a92464
  • Make new deadlock isolation test more reproducible. The original formulation of 4c9864b9b4d87d02f07f40bb27976da737afdcab was extremely timing-sensitive, because it arranged for the deadlock detector to be running (and possibly unblocking the current query) at almost exactly the same time as isolationtester would be probing to see if the query is blocked. The committed expected-file assumed that the deadlock detection would finish first, but we see the opposite on both fast and slow buildfarm animals. Adjust the deadlock timeout settings to make it predictable that isolationtester *will* see the query as waiting before deadlock detection unblocks it. I used a 5s timeout for the same reasons mentioned in a7921f71a3c747141344d8604f6a6d7b4cddb2a9. http://git.postgresql.org/pg/commitdiff/b11d07b6a3fc64904731e3b9a467a2567bc7dcdb
  • Shift the responsibility for emitting "database system is shut down". Historically this message has been emitted at the end of ShutdownXLOG(). That's not an insane place for it in a standalone backend, but in the postmaster environment we've grown a fair amount of stuff that happens later, including archiver/walsender shutdown, stats collector shutdown, etc. Recent buildfarm experimentation showed that on slower machines there could be many seconds' delay between finishing ShutdownXLOG() and actual postmaster exit. That's fairly confusing, both for testing purposes and for DBAs. Hence, move the code that prints this message into UnlinkLockFiles(), so that it comes out just after we remove the postmaster's pidfile. That is a more appropriate definition of "is shut down" from the point of view of "pg_ctl stop", for example. In general, removing the pidfile should be the last externally-visible action of either a postmaster or a standalone backend; compare commit d73d14c271653dff10c349738df79ea03b85236c for instance. So this seems like a reasonably future-proof approach. http://git.postgresql.org/pg/commitdiff/d18643c4a6d5ac41b012abc5d11fb5a7ccddf6c5
  • Fix typo in comment. http://git.postgresql.org/pg/commitdiff/2564be360a1d25a4c66e7cd34997ab027e0ec9a8
  • Move pg_constraint.h function declarations to new file pg_constraint_fn.h. A pending patch requires exporting a function returning Bitmapset from catalog/pg_constraint.c. As things stand, that would mean including nodes/bitmapset.h in pg_constraint.h, which might be hazardous for the client-side includability of that header. It's not entirely clear whether any client-side code needs to include pg_constraint.h, but it seems prudent to assume that there is some such code somewhere. Therefore, split off the function definitions into a new file pg_constraint_fn.h, similarly to what we've done for some other catalog header files. http://git.postgresql.org/pg/commitdiff/72eee410d48dfb4e6f3a0b751c4b0057ca8adc81
  • Remove GROUP BY columns that are functionally dependent on other columns. If a GROUP BY clause includes all columns of a non-deferred primary key, as well as other columns of the same relation, those other columns are redundant and can be dropped from the grouping; the pkey is enough to ensure that each row of the table corresponds to a separate group. Getting rid of the excess columns will reduce the cost of the sorting or hashing needed to implement GROUP BY, and can indeed remove the need for a sort step altogether. This seems worth testing for since many query authors are not aware of the GROUP-BY-primary-key exception to the rule about queries not being allowed to reference non-grouped-by columns in their targetlists or HAVING clauses. Thus, redundant GROUP BY items are not uncommon. Also, we can make the test pretty cheap in most queries where it won't help by not looking up a rel's primary key until we've found that at least two of its columns are in GROUP BY. David Rowley, reviewed by Julien Rouhaud http://git.postgresql.org/pg/commitdiff/d4c3a156cb46dcd1f9f97a8011bd94c544079bb5
  • Refactor check_functional_grouping() to use get_primary_key_attnos(). If we ever get around to allowing functional dependency to be proven from other things besides simple primary keys, this code will need to be rethought, but that was true anyway. In the meantime, we might as well not have two very-similar routines for scanning pg_constraint. David Rowley, reviewed by Julien Rouhaud http://git.postgresql.org/pg/commitdiff/f144f73242acef574bc27a4c70e809a64806e4a4
  • Further tweaking of deadlock isolation tests. The new deadlock-soft-2 test has a timing dependency too: it supposes that isolationtester will detect step s1b as waiting before the deadlock detector runs and grants it the lock. Adjust deadlock_timeout to ensure that that's true even in CLOBBER_CACHE_ALWAYS builds, where the wait detection query is quite slow. Per buildfarm member jaguarundi. http://git.postgresql.org/pg/commitdiff/caefc11ef6613683ddf8ded2081da3db238f463e
  • Re-pgindent isolationtester.c. Need to do some more hacking on this, and got annoyed that it's not indent clean. http://git.postgresql.org/pg/commitdiff/a361490806435fda6340fa13c0a881767c57c87a
  • Still further tweaking of deadlock isolation tests. It turns out that there is a second race condition in the new deadlock-hard test: once the deadlock detector fires, it's uncertain whether step s7a8 or step s8a1 will report first, because killing s8's transaction unblocks s7. So far, s7 has only been seen to report first in CLOBBER_CACHE_ALWAYS builds, but it's pretty reproducible there, and in theory it should sometimes occur in normal builds too. If s7 were a bit slower than usual, that could also break the test, since the existing expected-file assumes that we'll see s7a8 report the first time we check it after s8a1 completes. To fix, add a post-lock delay to s7a8. http://git.postgresql.org/pg/commitdiff/d03130d378b5fb071d231a7822784ad87268583a
  • isolationtester: don't repeat the is-it-waiting query when retrying a step. If we're retrying a step, then we already decided it was blocked on a lock, and there's no need to recheck that. The original coding of commit 38f8bdcac4982215beb9f65a19debecaf22fd470 resulted in a large number of is-it-waiting queries when dealing with multiple concurrently-blocked sessions, which is fairly pointless and also results in test failures in CLOBBER_CACHE_ALWAYS builds, where the is-it-waiting query is quite slow. This definition also permits appending pg_sleep() calls to steps where it's needed to control the order of finish of concurrent steps. Before, that did not work nicely because we'd decide that a step performing a sleep was not blocked and hang up waiting for it to finish, rather than noticing the completion of the concurrent step we're supposed to notice first. In passing, revise handling of removal of completed waiting steps to make it a bit less messy. http://git.postgresql.org/pg/commitdiff/9c9782f066e0ce5424b8706df2cce147cb78170f
  • Revert "isolationtester: don't repeat the is-it-waiting query when retrying a step." This mostly reverts commit 9c9782f066e0ce5424b8706df2cce147cb78170f. I left in the parts that rearranged removal of completed waiting steps; but the idea of not rechecking a step's blocked-ness isn't working. http://git.postgresql.org/pg/commitdiff/dca369320f6023b55feb49f281d394181fc57903
  • Revert "Still further tweaking of deadlock isolation tests." This reverts commit d03130d378b5fb071d231a7822784ad87268583a. That was dependent on an isolationtester.c change that now proves to be broken; we will need to find another solution. http://git.postgresql.org/pg/commitdiff/3992188c2a8702bcb92140a840b5378b27468921
  • Increase deadlock_timeout some more in the deadlock-hard isolation test. The previous value of 5s is inadequate for the buildfarm's CLOBBER_CACHE_ALWAYS animals: they take long enough to do the is-it-waiting queries that the timeout expires, allowing the database state to change, before isolationtester is done looking. Perhaps 10s will be enough. (If it isn't, I'm inclined to reduce the number of sessions involved.) http://git.postgresql.org/pg/commitdiff/e84e06d2b3fc48c514fd44f7ac390eb5f3e20d72
  • Add missing "static" qualifier. Per buildfarm member pademelon. http://git.postgresql.org/pg/commitdiff/99a9d6d563f389ad8137984aac13c9c0bd37cb66
  • Make GetLockStatusData's header comment resemble reality. The API spec for this function was changed completely (and for the better) by commit 3cba8999b343648c4c528432ab3d51400194e93b, but it didn't bother with anything as mundane as updating the comments. http://git.postgresql.org/pg/commitdiff/9b92e76f7b6dcdc2de6fae53a1c069297ba454fc
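
A minimal sketch of the num_nulls()/num_nonnulls() functions added above (table and column names are made up for the example):

-- count NULL and non-NULL arguments directly
SELECT num_nulls(1, NULL, 2), num_nonnulls(1, NULL, 2);    -- returns 1 and 2

-- the use-case from the commit message: exactly one of a, b, c may be set
CREATE TABLE contact_method (
    a text,
    b text,
    c text,
    CHECK (num_nonnulls(a, b, c) = 1)
);

-- the functions are variadic, so an array can be spread with VARIADIC
-- to count its NULL or non-NULL elements
SELECT num_nonnulls(VARIADIC ARRAY[1, 2, NULL, 4]);        -- returns 3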

Peter Eisentraut pushed:

Noah Misch pushed:

Andres Freund pushed:

  • Fix overeager pushdown of HAVING clauses when grouping sets are used. In 61444bfb we started to allow HAVING clauses to be fully pushed down into WHERE, even when grouping sets are in use. That turns out not to work correctly, because grouping sets can "produce" NULLs, meaning that filtering in WHERE and HAVING can have different results, even when no aggregates or volatile functions are involved. Instead only allow pushdown of empty grouping sets. It'd be nice to do better, but the exact mechanics of deciding which cases are safe are still being debated. It's important to give correct results till we find a good solution, and such a solution might not be appropriate for backpatching anyway. Bug: #13863 Reported-By: 'wrb' Diagnosed-By: Dean Rasheed Author: Andrew Gierth Reviewed-By: Dean Rasheed and Andres Freund Discussion: 20160113183558.12989.56904@wrigleys.postgresql.org Backpatch: 9.5, where grouping sets were introduced http://git.postgresql.org/pg/commitdiff/a6897efab92bc7e645b6c6d15274b8d61c53fe8f

Joe Conway pushed:

  • Change delimiter used for display of NextXID. NextXID has been rendered in the form of a pg_lsn even though it really is not. This can cause confusion, so change the format from %u/%u to %u:%u, per discussion on hackers. Complaint by me, patch by me and Bruce, reviewed by Michael Paquier and Alvaro. Applied to HEAD only. Author: Joe Conway, Bruce Momjian Reviewed-by: Michael Paquier, Alvaro Herrera Backpatch-through: master http://git.postgresql.org/pg/commitdiff/59a884e9854cb3cb7338394fb5f856209b040fb3

Bruce Momjian pushed:

Rejected patches (to date)

No one was disappointed this week :-)

Pending patches

Andreas 'ads' Scherbaum sent in two revisions of a patch to change the 32-bit counter in PL/pgsql's GET DIAGNOSTICS ... ROWCOUNT to 64-bit, allowing larger result sets.

Michaël Paquier sent in a patch to fix psql's tab completion for ALTER FUNCTION.

Alexander Korotkov sent in another revision of a patch to move PinBuffer and UnpinBuffer to atomics.

Artur Zakirov sent in two more revisions of a patch to add fuzzy substring searching with the pg_trgm extension.

Dmitry Ivanov sent in a patch to add phrase search to textsearch.

David Steele sent in a patch to allow hiding messages below ERROR from the client.

Fabien COELHO and Michaël Paquier traded patches to extend pgbench expressions with functions.

Etsuro Fujita sent in a patch to fix a copy-pasto in the ExecForeignDelete documentation.

David Steele sent in another revision of a patch to add a pgaudit extension.

Fabien COELHO and Álvaro Herrera traded patches to extend pgbench stats, etc.

Daniel Verité sent in another revision of a patch to create a crosstab view in psql.

Iacob Catalin and Pavel Stěhule traded patches to add an ereport function to PL/PythonU.

Teodor Sigaev sent in another revision of a patch to add tsvector editing functions.

Corey Huinker sent in another revision of a patch to add generate_series(date, date[, integer]).

Kyotaro HORIGUCHI and Fujii Masao traded patches to fix an incorrect formula for SysV IPC parameters.

SAWADA Masahiko and Michaël Paquier traded patches to allow N > 1 synchronous standby servers.

Amit Langote sent in two revisions of a patch to fix a typo in syncrep.c.

Thomas Munro sent in another revision of a patch to add causal reads.

Thomas Munro sent in another revision of a patch to detect SSI conflicts before reporting constraint violations.

Konstantin Knizhnik sent in a patch to allow batch updating of indexes.

Etsuro Fujita and Rushabh Lathia traded patches to speed up updating foreign tables through the PostgreSQL FDW.

Michaël Paquier sent in four more revisions of a patch to fix hot standby checkpoints.

SAWADA Masahiko sent in two more revisions of a patch to add a "frozen" bit to the visibility map.

Heikki Linnakangas sent in a patch to fix the optimization to skip WAL-logging on table created in same xact.

Haribabu Kommi sent in another revision of a patch to create a pg_hba_lookup function to get all matching pg_hba.conf entries.

Kyotaro HORIGUCHI sent in a patch to help improve in-core regression tests.

Michaël Paquier sent in two more revisions of a patch to help fix silent data loss on ext4 filesystems.

Kyotaro HORIGUCHI sent in a patch to add IF (NOT) EXISTS support to tab completion in psql.

Peter Geoghegan sent in another revision of a patch to fix an OpenSSL error queue bug.

Vinayak Pokale sent in another revision of a patch to implement a vacuum progress checker.

Filip Rembiałkowski sent in three revisions of a patch to make NOTIFY list de-duplication optional.

Fabien COELHO sent in another revision of a patch to fix pgbench so it doesn't run much longer than the run length under certain pathological conditions.

Robert Haas sent in a patch to push target-list evaluation down to the parallel worker whenever possible.

Vitaly Burovoy sent in two revisions of a patch to make NOT NULL constraints follow SQL-2011.

Peter Eisentraut sent in another revision of a patch to remove the WAL level "archive."

Thomas Munro sent in another revision of a patch to add a new log line prefix for cluster name.

Fujii Masao sent in another revision of a patch to check for a suitable index when refreshing a materialized view.

Jim Nasby sent in a patch to convert pltcl from strings to objects.

Jeff Janes sent in a patch to add s2k-count to pgcrypto.

Jeff Janes sent in two revisions of a patch to fix a bug in StartupSUBTRANS.

Pavel Stěhule and Jim Nasby traded patches to implement parsing an identifier into a text array.

Konstantin Knizhnik sent in another revision of a patch to create an extensible transaction manager API.

Stephen Frost sent in a patch to improve docs wrt catalog object ACLs.

Magnus Hagander sent in a patch to update the backup APIs to support non-exclusive backups.

Vitaly Burovoy sent in another revision of a patch to add pg_size_bytes().
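
Assuming the interface proposed in that patch (a single text argument, roughly the inverse of pg_size_pretty()), usage would presumably look like the sketch below; the function had not been committed at the time of writing.

-- convert a human-readable size into bytes (proposed function)
SELECT pg_size_bytes('1 GB');   -- would be expected to return 1073741824

-- for instance, list relations larger than a human-readable threshold
SELECT relname
FROM pg_class
WHERE pg_relation_size(oid) > pg_size_bytes('500 MB');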

Aleksander Alekseev and Robert Haas traded patches to optimize dynahashes.

Robbie Harwood sent in another revision of a patch to overhaul support for GSSAPI encryption.

Artur Zakirov sent in another revision of a patch to add fuzzy substring searching to the pg_trgm extension.

Fabien COELHO sent in four more revisions of a patch to extend pgbench with functions.

Andreas 'ads' Scherbaum sent in another revision of a patch to allow PL/pgsql's GET DIAGNOSTICS to work with numbers of rows that overflow int32.

Anastasia Lubennikova sent in another revision of a patch to implement covering + unique indexes.

Magnus Hagander sent in a patch to ensure that all the functions PL/pgsql exports are actually prefixed with plpgsql_.

Andres Freund sent in a patch to make SetHintBit() a bit more aggressive, in order to fix the potential regressions created by another part of the patch, which curbs overaggressive flushing by the WAL writer by flushing only every wal_writer_delay ms or wal_writer_flush_after bytes.

Michaël Paquier sent in another revision of a patch to ensure that GinPageIs* actually return a boolean.

Eugene Kazakov and Michaël Paquier traded patches to fix the TAP tests.

Christian Ullrich sent in another revision of a patch to fix a crash with old Windows on a new CPU.

Magnus Hagander sent in a patch to refactor receivelog.c to be less intertwined with itself.

by N Bougain on Monday 15 February 2016 at 03:46

Saturday 6 February 2016

Guillaume Lelarge

Start of the 9.5 manual translation

I have finally finished merging the manual for version 9.5. Very little got done before the 9.5 release, as the little time I had was devoted to my book. But now we're set and work can begin. In fact, Flavie has already started and has translated a batch of new files. There is still plenty left to do, though. For those interested, it's over here: https://github.com/gleu/pgdocs_fr/wiki/Translation-9.5

Don't hesitate to send me any questions if you are interested in taking part.

by Guillaume Lelarge on Saturday 6 February 2016 at 12:18

Thursday 4 February 2016

Rodolphe Quiédeville

Indexing to search for short strings in PostgreSQL

Search fields in web applications can offer suggested results as each character is typed into the form. To avoid putting too much load on the data storage systems, the standard modules let you set a lower limit, the search only kicking in from the third character entered. This 3-character limit is explained by how easy it is to define trigram indexes in the database; for PostgreSQL this is done with the standard pg_trgm extension (for a detailed study of trigrams I recommend reading this article).
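
As an aside, once the pg_trgm extension is installed (we do that a little further down), its show_trgm() function shows which trigrams a string is decomposed into, which makes the 3-character threshold easier to picture (the output shown is indicative):

~# SELECT show_trgm('fff');
        show_trgm
-------------------------
 {"  f"," ff","ff ",fff}
(1 row)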

While this technique has made search forms much more comfortable to use, it nevertheless poses a problem when you need to search for a string of just two characters. Ill-advised and counter-productive, you may say (I rather share that view), but imagine the case of a Ms or Mr Ba who is present in the database and whose first name was never entered, or who has no first name: they will never show up in these search forms, which is rather unfortunate for them.

In this article we will see how to solve this problem. Let's start by creating a table with 50,000 rows of random text data:

CREATE TABLE blog AS SELECT s, md5(random()::text) as d 
   FROM generate_series(1,50000) s;
~# SELECT * from blog LIMIT 4;
 s |                 d                
---+----------------------------------
 1 | 8fa4044e22df3bb0672b4fe540dec997
 2 | 5be79f21e03e025f00dea9129dc96afa
 3 | 6b1ffca1425326bef7782865ad4a5c5e
 4 | 2bb3d7093dc0fffd5cebacd07581eef0
(4 rows)

Let's say we're a fan of 80s music and we want to know whether our table contains any text with the string fff.

~# EXPLAIN ANALYZE SELECT * FROM blog WHERE d like '%fff%';
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Seq Scan on blog  (cost=0.00..1042.00 rows=5 width=37) (actual time=0.473..24.130 rows=328 loops=1)
   Filter: (d ~~ '%fff%'::text)
   Rows Removed by Filter: 49672
 Planning time: 0.197 ms
 Execution time: 24.251 ms
(5 rows)

Without an index, as you would expect, this results in a sequential scan of the table, so let's add an index. To index this column with a GIN index we will use the gin_trgm_ops operator class available in the pg_trgm extension.

~# CREATE EXTENSION pg_trgm;
CREATE EXTENSION
~# CREATE INDEX blog_trgm_idx ON blog USING GIN(d gin_trgm_ops);
CREATE INDEX
~# EXPLAIN ANALYZE SELECT * FROM blog WHERE d like '%fff%';
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on blog  (cost=16.04..34.46 rows=5 width=37) (actual time=0.321..1.336 rows=328 loops=1)
   Recheck Cond: (d ~~ '%fff%'::text)
   Heap Blocks: exact=222
   ->  Bitmap Index Scan on blog_trgm_idx  (cost=0.00..16.04 rows=5 width=0) (actual time=0.176..0.176 rows=328 loops=1)
         Index Cond: (d ~~ '%fff%'::text)
 Planning time: 0.218 ms
 Execution time: 1.451 ms

This time the index could be used, and note in passing that the query time drops by a factor of 20. But if we now want to search for a string of only 2 characters, a sequential scan happens once again: our trigram index is useless for this new search.

~# EXPLAIN ANALYZE SELECT * FROM blog WHERE d like '%ff%';
                                               QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Seq Scan on blog  (cost=0.00..1042.00 rows=3030 width=37) (actual time=0.016..11.712 rows=5401 loops=1)
   Filter: (d ~~ '%ff%'::text)
   Rows Removed by Filter: 44599
 Planning time: 0.165 ms
 Execution time: 11.968 ms

This is where bigram indexes come in, which, as their name suggests, work on pairs of characters rather than triplets. First we will test pgroonga, which is packaged for Debian, Ubuntu, CentOS and other exotic systems; you will find all the instructions to set it up on the project's install page.

The packaged builds of version 1.0.0 currently only support PostgreSQL 9.3 and 9.4, but the sources have just been tagged 1.0.1 with support for 9.5.

CREATE EXTENSION pgroonga;

The index is then created using:

~# CREATE INDEX blog_pgroonga_idx ON blog USING pgroonga(d);
CREATE INDEX
~# EXPLAIN ANALYZE SELECT * FROM blog WHERE d like '%ff%';
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on blog  (cost=27.63..482.51 rows=3030 width=37) (actual time=3.721..5.874 rows=2378 loops=1)
   Recheck Cond: (d ~~ '%ff%'::text)
   Heap Blocks: exact=416
   ->  Bitmap Index Scan on blog_pgroonga_idx  (cost=0.00..26.88 rows=3030 width=0) (actual time=3.604..3.604 rows=2378 loops=1)
         Index Cond: (d ~~ '%ff%'::text)
 Planning time: 0.280 ms
 Execution time: 6.230 ms

The index is used again, with the expected performance gain.

Another solution: pg_bigm, which is dedicated specifically to bigram indexes. Installation is done either from RPM packages or directly from the sources, with clear and detailed instructions on the project's site. pg_bigm supports every version from 9.1 through 9.5.

~# CREATE INDEX blog_bigm_idx ON blog USING GIN(d gin_bigm_ops);
CREATE INDEX
~# EXPLAIN ANALYZE SELECT * FROM blog WHERE d like '%ff%';
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on blog  (cost=35.48..490.36 rows=3030 width=37) (actual time=2.121..5.347 rows=5401 loops=1)
   Recheck Cond: (d ~~ '%ff%'::text)
   Heap Blocks: exact=417
   ->  Bitmap Index Scan on blog_bigm_idx  (cost=0.00..34.73 rows=3030 width=0) (actual time=1.975..1.975 rows=5401 loops=1)
         Index Cond: (d ~~ '%ff%'::text)
 Planning time: 4.406 ms
 Execution time: 6.052 ms

On a table of 500k tuples, index creation takes 6.5 seconds for pg_bigm versus 4.8 for pgroonga; for reads I have not found a query pattern showing any real difference, although pgroonga claims to be faster than pg_bigm. The former being more recent than the latter, we can expect it to have benefited from the latter's experience.

As for licensing, both projects are published under the PostgreSQL license.

The real difference between the two projects is that pgroonga is a subproject of the larger Groonga project, which is dedicated to full-text search (there is also Mgroonga, for example, whose target you will easily guess), whereas pg_bigm is a standalone project that implements only bigrams in PostgreSQL.

You now have two methods for indexing 2-grams; take care, however, not to overuse them.

PostgreSQL version 9.4.5 was used while writing this article.

by Rodolphe Quiédeville on Thursday 4 February 2016 at 08:38

Wednesday 3 February 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 31 January 2016

"5432 ... Meet us!" aura lieu à Milan (Italie) les 28 & 29 juin 2016. L'appel à conférenciers court jusqu'au 28 février : http://5432meet.us/

News from derived products

PostgreSQL job offers for January

PostgreSQL Local

  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference held on 17 and 18 February 2016 in Prague (Czech Republic). Website in Czech: http://www.p2d2.cz/
  • The annual Indian PGDay will be held in Bangalore (Karnataka, India) on 26 February 2016: http://pgday.in
  • The first pan-Asian PostgreSQL conference will be held on 16 and 17 March 2016 in Singapore: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • The 8th PostgreSQL Session will take place on 6 April 2016 in Lyon (France). The call for speakers runs until 29 February at call-for-paper AT postgresql-sessions DOT org.
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York: http://www.pgconf.us/2016/
  • LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for speakers has now been launched: http://www.linuxfestnorthwest.org/2016/present
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. The call for speakers has been launched: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will be held from 17 to 21 May 2016 in Ottawa: http://www.pgcon.org/
  • The Swiss PGDay will be held this year at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for speakers has been launched: http://www.pgday.ch/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translated by the PostgreSQLFr team under the CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160201051108.GA654@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org, and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Tatsuo Ishii pushed:

Kevin Grittner pushed:

Tom Lane pushed:

  • Improve ResourceOwners' behavior for large numbers of owned objects. The original coding was quite fast so long as objects were always released in reverse order of addition; otherwise, it degenerated into O(N^2) behavior due to searching for the array element to delete. Improve matters by switching to hashed storage when the number of objects of a given type exceeds 64. (The cutover point is open to discussion, of course, but some simple performance testing suggests that hashing has enough overhead to be a loser below there.) Also, refactor resowner.c so that we don't need N copies of the array management code. Since all the resource IDs the code currently needs to deal with are either pointers or integers, it seems sufficient to create a one-size-fits-all infrastructure in which everything is converted to a Datum for storage. Aleksander Alekseev, reviewed by Stas Kelvich, further fixes by me http://git.postgresql.org/pg/commitdiff/cc988fbb0bf60a83b628b5615e6bade5ae9ae6f4
  • Fix startup so that log prefix %h works for the log_connections message. We entirely randomly chose to initialize port->remote_host just after printing the log_connections message, when we could perfectly well do it just before, allowing %h and %r to work for that message. Per gripe from Artem Tomyuk. http://git.postgresql.org/pg/commitdiff/b8682a7155bee06667c5773e1ca6499a670338b0
  • Fix incorrect pattern-match processing in psql's \det command. listForeignTables' invocation of processSQLNamePattern did not match up with the other ones that handle potentially-schema-qualified names; it failed to make use of pg_table_is_visible() and also passed the name arguments in the wrong order. Bug seems to have been aboriginal in commit 0d692a0dc9f0e532. It accidentally sort of worked as long as you didn't inquire too closely into the behavior, although the silliness was later exposed by inconsistencies in the test queries added by 59efda3e50ca4de6 (which I probably should have questioned at the time, but didn't). Per bug #13899 from Reece Hart. Patch by Reece Hart and Tom Lane. Back-patch to all affected branches. http://git.postgresql.org/pg/commitdiff/7e22470471e9ed7010fcbc4a18b0a461d088d7c7

Álvaro Herrera pushed:

Fujii Masao pushed:

Robert Haas pushed:

  • Fix cross-version pg_dump for aggregate combine functions. Fixes a defect in commit a7de3dc5c346e07e0439275982569996e645b3c2. David Rowley, per report from Jeff Janes, who also checked that the fix works. http://git.postgresql.org/pg/commitdiff/025b2f339260b727e113a01a20b616a336bff00a
  • Assert that create_unique_path returns non-NULL. Per off-list discussion with Tom Lane and Michael Paquier, Coverity gets unhappy if this is not done. http://git.postgresql.org/pg/commitdiff/eaf7b1f6432480e93d8c6824fbd503761a1c1a4f
  • Add [NO]BYPASSRLS options to CREATE USER and ALTER USER docs. Patch-by: Filip Rembiałkowski Reviewed-by: Robert Haas Backpatch-through: 9.5 http://git.postgresql.org/pg/commitdiff/80db1ca2d79338c35bb3e01f2aecad78c2231b06 (a brief syntax example appears after this list)
  • Avoid multiple foreign server connections when all use same user mapping. Previously, postgres_fdw's connection cache was keyed by user OID and server OID, but this can lead to multiple connections when it's not really necessary. In particular, if all relevant users are mapped to the public user mapping, then their connection options are certainly the same, so one connection can be used for all of them. While we're cleaning things up here, drop the "server" argument to GetConnection(), which isn't really needed. This saves a few cycles because callers no longer have to look this up; the function itself does, but only when establishing a new connection, not when reusing an existing one. Ashutosh Bapat, with a few small changes by me. http://git.postgresql.org/pg/commitdiff/96198d94cb7adc664bda341842dc8db671d8be72
  • Add missing quotation mark. This fix accidentally got left out of the previous commit. http://git.postgresql.org/pg/commitdiff/2f6b041f76e6de0fa2921131a23bda794ffb83bb
  • Only try to push down foreign joins if the user mapping OIDs match. Previously, the foreign join pushdown infrastructure left the question of security entirely up to individual FDWs, but it would be easy for a foreign data wrapper to inadvertently open up subtle security holes that way. So, make it the core code's job to determine which user mapping OID is relevant, and don't attempt join pushdown unless it's the same for all relevant relations. Per a suggestion from Tom Lane. Shigeru Hanada and Ashutosh Bapat, reviewed by Etsuro Fujita and KaiGai Kohei, with some further changes by me. http://git.postgresql.org/pg/commitdiff/fbe5a3fb73102c2cfec11aaaa4a67943f4474383
  • postgres_fdw: Refactor deparsing code for locking clauses. The upcoming patch to allow join pushdown in postgres_fdw needs to use this code multiple times, which requires moving it to deparse.c. That seems like a good idea anyway, so do that now both on general principle and to simplify the future patch. Inspired by a patch by Shigeru Hanada and Ashutosh Bapat, but I did it a little differently than what that patch did. http://git.postgresql.org/pg/commitdiff/b88ef201d46e6519b5e0589358c952a4c0f5bf0f
  • Migrate PGPROC's backendLock into PGPROC itself, using a new tranche. Previously, each PGPROC's backendLock was part of the main tranche, and the PGPROC just contained a pointer. Now, the actual LWLock is part of the PGPROC. As with previous, similar patches, this makes it significantly easier to identify these lwlocks in LWLOCK_STATS or Trace_lwlocks output and improves modularity. Author: Ildus Kurbangaliev Reviewed-by: Amit Kapila, Robert Haas http://git.postgresql.org/pg/commitdiff/b319356f0e94a6482c726cf4af96597c211d8d6e
  • Migrate replication slot I/O locks into a separate tranche. This is following in a long train of similar changes and for the same reasons - see b319356f0e94a6482c726cf4af96597c211d8d6e and fe702a7b3f9f2bc5bf6d173166d7d55226af82c8 inter alia. Author: Amit Kapila Reviewed-by: Alexander Korotkov, Robert Haas http://git.postgresql.org/pg/commitdiff/2251179e6ad3a865d2f55e1832fab34608fcce43
  • postgres_fdw: More preliminary refactoring for upcoming join pushdown. The code that generates a complete SQL query for a given foreign relation was repeated in two places, and they didn't quite agree: the EXPLAIN case left out the locking clause. Centralize the code so we get the same behavior everywhere, and adjust calling conventions and which functions are static vs. extern accordingly. Ashutosh Bapat, reviewed and slightly adjusted by me. http://git.postgresql.org/pg/commitdiff/cc592c48c58d9c1920f8e2063756dcbcce79e4dd
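
As a side note on the [NO]BYPASSRLS documentation fix above, the attributes it documents are ordinary role options; a minimal illustration (the role name here is invented for the example) looks like this:

  CREATE USER reporting_app LOGIN BYPASSRLS;  -- queries from this role bypass row-level security policies
  ALTER USER reporting_app NOBYPASSRLS;       -- remove the attribute again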

Peter Eisentraut pushed:

Andrew Dunstan pushed:

Rejected patches (so far)

No one was disappointed this week :-)

Pending patches

Michael Paquier sent in a patch to ensure that fsyncs actually fsync in the case of renaming.

Corey Huinker and Michael Paquier traded revisions of a patch to add generate_series(date,date) and generate_series(date,date,integer).
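
Whether or not that patch is ultimately committed, the effect it aims for can already be approximated with the existing timestamp variant of generate_series plus a cast, for example:

  SELECT d::date
  FROM generate_series('2016-01-01'::timestamp, '2016-01-07'::timestamp, interval '1 day') AS g(d);
  -- returns the seven dates from 2016-01-01 through 2016-01-07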

Corey Huinker sent in two more revisions of a patch to allow limiting FETCH by bytes.

Kaigai Kouhei sent in another revision of a patch to add CustomScan support and an example.

Vinayak Pokale sent in two more revisions of a patch to implement a vacuum progress checker.

Amit Kapila sent in two more revisions of a patch to expand pg_stat_activity to show waiting.

Pavel Stěhule sent in two more revisions of a patch to add pg_size_bytes().
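
If that patch is committed under the proposed name, pg_size_bytes() would act as the rough inverse of the existing pg_size_pretty(), along these lines (the second call is the patch's proposal, not yet a released function):

  SELECT pg_size_pretty(1073741824::bigint);  -- existing function: '1024 MB'
  SELECT pg_size_bytes('1 GB');               -- proposed function: 1073741824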

Pavel Stěhule sent in another revision of a patch to add num_notnulls().

Stas Kelvich sent in a patch to speed up two-phase commits by reading state files into memory during the replay of prepare, and if checkpoint/restartpoint occurs between prepare and commit, to move the data to files.

Craig Ringer sent in two more revisions of a patch to implement failover slots.

SAWADA Masahiko sent in two revisions of a patch to check for a suitable index to use in REFRESH MATERIALIZED VIEW CONCURRENTLY.
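
For context, REFRESH MATERIALIZED VIEW CONCURRENTLY already requires a unique index on the materialized view; the patch is about detecting a suitable one up front. A minimal sketch (all object names here are invented for the example) is:

  CREATE MATERIALIZED VIEW mv_sales AS SELECT id, total FROM sales;
  CREATE UNIQUE INDEX ON mv_sales (id);             -- required before CONCURRENTLY can be used
  REFRESH MATERIALIZED VIEW CONCURRENTLY mv_sales;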

Kyotaro HORIGUCHI sent in another revision of a patch to allow async-capable nodes to run the node before ExecProcNode().

Pavel Stěhule and Iacob Catalin traded patches to add an ereport function to PL/PythonU.

Anastasia Lubennikova sent in another revision of a patch to add covering unique indexes.

Etsuro Fujita sent in a patch to fix some capitalizations in fdwhandlers.sgml.

Erik Rijkers sent in a patch to fix some typos in the pgbench docs.

Aleksander Alekseev sent in another revision of a patch to optimize dynahashes.

Stas Kelvich sent in another revision of a patch to add tsvector editing functions.

Pavel Stěhule sent in another revision of a patch to better support %TYPE in PL/pgsql.

José Arthur Benetasso Villanova sent in another revision of a patch to log the operating system user name of clients connecting via Unix socket.

Fabien COELHO sent in a patch to fix a couple of minor bugs in pgbench.

Ashutosh Bapat sent in a patch to keep from making separate DB connections to remote servers when not needed in FDWs.

Artur Zakirov sent in a patch to fix some infelicities in tsearch2 parsing.

Petr Jelínek sent in another revision of a patch to add a sequence access method.

Peter Eisentraut sent in two more revisions of a patch to integrate better with systemd.

Michael Paquier sent in three more revisions of a patch to avoid unneeded checkpoints.

Noah Misch sent in two revisions of a patch to fix an issue that could cause a backend crash with nested CREATE TEMP TABLE invocations.

Fabien COELHO and Michael Paquier traded patches to extend pgbench expressions with functions.

Amit Kapila sent in a patch to reduce the number of WAL writes.

Dilip Kumar sent in another revision of a patch to help scale relation extensions.

Artur Zakirov sent in two more revisions of a patch to improve Hunspell dictionary support.

Michael Paquier sent in a patch to fix some comment typos.

Etsuro Fujita sent in another revision of a patch to make foreign table writes more efficient.

Kaigai Kouhei sent in two more revisions of a patch to add CustomScan under a Gather node.

Anastasia Lubennikova sent in three more revisions of a patch to compress B-Trees in a way that's analogous to that just used for GIN.

Ashutosh Bapat sent in two more revisions of a patch to make read operations on the PostgreSQL FDW more efficient.

Fujii Masao and SAWADA Masahiko traded patches to fix tab completion options for SET/RESET in psql.

Alexander Korotkov sent in another revision of a patch to enable creating access methods with an example for Bloom filters.

Alexander Korotkov sent in a patch to refactor SLRU tranches.

Ashutosh Bapat sent in a patch to move the code to deparse SELECT statements into a function deparseSelectStmtForRel().

Vitaly Burovoy sent in a patch to fix an overflow in EXTRACT.

Vik Fearing sent in another revision of a patch to add an idle-in-transaction timeout.
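
Assuming the setting keeps the name used in the patch discussion (idle_in_transaction_session_timeout, an assumption until the patch is committed), it would be set like any other timeout GUC:

  SET idle_in_transaction_session_timeout = '5min';  -- proposed GUC: sessions idle inside a transaction longer than this are terminated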

Thomas Munro sent in a patch to detect SSI conflicts before reporting constraint violations.

Vitaly Burovoy sent in a patch to make the behavior of all versions of the "isinf" function be similar.

by N Bougain on Wednesday 3 February 2016 at 22:29

Monday 1 February 2016

Guillaume Lelarge

My book has been published: "PostgreSQL, architecture et notions avancées"

After nearly two years of work, my book has finally been published. To be honest, it is quite astonishing to hold it in your hands: a real book, with a real binding and a real cover, written by yourself. It brings a good deal of pride, along with quite a few questions about how it will be received.

That said, whether or not the book turns out to be a success in itself, it is already a personal success for me. The challenge was to write a 300-page book on PostgreSQL, the book I would have liked to have had in my hands when I started using this DBMS more than 15 years ago, at the urging of my former boss.

The result lives up to my expectations, and the first reviews are very positive. The book offers many explanations of how PostgreSQL works and behaves, so that it is no longer some kind of black box that executes queries. The review written by Jean-Michel Armand in GNU/Linux Magazine France issue 190 is genuinely interesting. I agree with its author that the beginning is fairly hard going: you dive straight into the technical details, without much about how they are used in production; that part is only covered later. It is a question I asked myself while writing, but it is the eternal chicken-and-egg problem... You have to start somewhere: either you explain the technical groundwork first (which is a bit rough) and then show how it is applied, or you do the opposite. Neither approach is clearly better than the other. The choice I made still seems the right one to me, even now. But indeed, there are two ways to read the book: starting from the beginning, or going straight to the thematic chapters.

I am already prepared to get back to work on an even better second edition. That new edition could be based on the next major version of PostgreSQL, currently numbered 9.6, which already includes some very exciting new features. But it will only be genuinely worthwhile if it takes into account feedback from readers of the first edition, to correct and improve whatever needs it. So please do not hesitate to send me any comments about the book; they will be much appreciated.

by Guillaume Lelarge on Monday 1 February 2016 at 17:57

Thursday 28 January 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 24 January 2016

The 8th PostgreSQL Session will take place on 6 April 2016 in Lyon (France). The call for speakers runs until 29 February at call-for-paper AT postgresql-sessions DOT org.

News from derived products

PostgreSQL-related job offers for January

PostgreSQL Local

  • FOSDEM PGDay is a one-day conference held just before FOSDEM in Brussels (Belgium) on 29 January 2016. Registration is still open: http://fosdem2016.pgconf.eu/
  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference on 17 and 18 February 2016 in Prague (Czech Republic). Site in Czech: http://www.p2d2.cz/
  • The annual Indian PGday will be held in Bangalore (Karnataka, India) on 26 February 2016: http://pgday.in
  • The first pan-Asian PostgreSQL conference will take place on 16 and 17 March 2016 in Singapore: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York. The call for speakers closes on 31 January 2016 at 23:59 EST: http://www.pgconf.us/2016/
  • LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for speakers is now open: http://www.linuxfestnorthwest.org/2016/present
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. The call for speakers is open: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will take place from 17 to 21 May 2016 in Ottawa: http://www.pgcon.org/
  • The Swiss PGDay will be held this year at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for speakers is open: http://www.pgday.ch/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under a CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20160125053004.GA10201@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Tom Lane pushed:

  • Re-pgindent a few files. In preparation for landing index AM interface changes. http://git.postgresql.org/pg/commitdiff/8d290c8ec6c182a4df1d089c21fe84c7912f01fe
  • Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me. http://git.postgresql.org/pg/commitdiff/65c5fcd353a859da9e61bfb2b92a99f12937de3b
  • Add explicit cast to amcostestimate call. My compiler doesn't complain here, but David Rowley's does ... http://git.postgresql.org/pg/commitdiff/49b49506502026a3653bca490c939dc8934afe95
  • Fix assorted inconsistencies in GiST opclass support function declarations. The conventions specified by the GiST SGML documentation were widely ignored. For example, the strategy-number argument for "consistent" and "distance" functions is specified to be a smallint, but most of the built-in support functions declared it as an integer, and for that matter the core code passed it using Int32GetDatum not Int16GetDatum. None of that makes any real difference at runtime, but it's quite confusing for newcomers to the code, and it makes it very hard to write an amvalidate() function that checks support function signatures. So let's try to instill some consistency here. Another similar issue is that the "query" argument is not of a single well-defined type, but could have different types depending on the strategy (corresponding to search operators with different righthand-side argument types). Some of the functions threw up their hands and declared the query argument as being of "internal" type, which surely isn't right ("any" would have been more appropriate); but the majority position seemed to be to declare it as being of the indexed data type, corresponding to a search operator with both input types the same. So I've specified a convention that that's what to do always. Also, the result of the "union" support function actually must be of the index's storage type, but the documentation suggested declaring it to return "internal", and some of the functions followed that. Standardize on telling the truth, instead. Similarly, standardize on declaring the "same" function's inputs as being of the storage type, not "internal". Also, somebody had forgotten to add the "recheck" argument to both the documentation of the "distance" support function and all of their SQL declarations, even though the C code was happily using that argument. Clean that up too. Fix up some other omissions in the docs too, such as documenting that union's second input argument is vestigial. So far as the errors in core function declarations go, we can just fix pg_proc.h and bump catversion. Adjusting the erroneous declarations in contrib modules is more debatable: in principle any change in those scripts should involve an extension version bump, which is a pain. However, since these changes are purely cosmetic and make no functional difference, I think we can get away without doing that. http://git.postgresql.org/pg/commitdiff/9ff60273e35cad6e9d3a4adf59d5c2455afe9d9e
  • Fix assorted inconsistencies in GIN opclass support function declarations. GIN had some minor issues too, mostly using "internal" where something else would be more appropriate. I went with the same approach as in 9ff60273e35cad6e, namely preferring the opclass' indexed datatype for arguments that receive an operator RHS value, even if that's not necessarily what they really are. Again, this is with an eye to having a uniform rule for ginvalidate() to check support function signatures. http://git.postgresql.org/pg/commitdiff/dbe2328959e12701fade6b500ad411271923d6e4
  • Add defenses against putting expanded objects into Const nodes. Putting a reference to an expanded-format value into a Const node would be a bad idea for a couple of reasons. It'd be possible for the supposedly immutable Const to change value, if something modified the referenced variable ... in fact, if the Const's reference were R/W, any function that has the Const as argument might itself change it at runtime. Also, because datumIsEqual() is pretty simplistic, the Const might fail to compare equal to other Consts that it should compare equal to, notably including copies of itself. This could lead to unexpected planner behavior, such as "could not find pathkey item to sort" errors or inferior plans. I have not been able to find any way to get an expanded value into a Const within the existing core code; but Paul Ramsey was able to trigger the problem by writing a datatype input function that returns an expanded value. The best fix seems to be to establish a rule that varlena values being placed into Const nodes should be passed through pg_detoast_datum(). That will do nothing (and cost little) in normal cases, but it will flatten expanded values and thereby avoid the above problems. Also, it will convert short-header or compressed values into canonical format, which will avoid possible unexpected lack-of-equality issues for those cases too. And it provides a last-ditch defense against putting a toasted value into a Const, which we already knew was dangerous, cf commit 2b0c86b66563cf2f. (In the light of this discussion, I'm no longer sure that that commit provided 100% protection against such cases, but this fix should do it.) The test added in commit 65c3d05e18e7c530 to catch datatype input functions with unstable results would fail for functions that returned expanded values; but it seems a bit uncharitable to deem a result unstable just because it's expressed in expanded form, so revise the coding so that we check for bitwise equality only after applying pg_detoast_datum(). That's a sufficient condition anyway given the new rule about detoasting when forming a Const. Back-patch to 9.5 where the expanded-object facility was added. It's possible that this should go back further; but in the absence of clear evidence that there's any live bug in older branches, I'll refrain for now. http://git.postgresql.org/pg/commitdiff/b99551832e79c915e4d877cf0a072120bd248748
  • Suppress compiler warning. Given the limited range of i, these shifts should not cause any problem, but that apparently doesn't stop some compilers from whining about them. David Rowley http://git.postgresql.org/pg/commitdiff/d9b9289c837a98b78b948b597fabd9ab0a96c0db
  • Improve index AMs' opclass validation procedures. The amvalidate functions added in commit 65c5fcd353a859da were on the crude side. Improve them in a few ways: * Perform signature checking for operators and support functions. * Apply more thorough checks for missing operators and functions, where possible. * Instead of reporting problems as ERRORs, report most problems as INFO messages and make the amvalidate function return FALSE. This allows more than one problem to be discovered per run. * Report object names rather than OIDs, and work a bit harder on making the messages understandable. Also, remove a few more opr_sanity regression test queries that are now superseded by the amvalidate checks. http://git.postgresql.org/pg/commitdiff/be44ed27b86ebd165bbedf06a4ac5a8eb943d43c
  • Make extract() do something more reasonable with infinite datetimes. Historically, extract() just returned zero for any case involving an infinite timestamp[tz] input; even cases in which the unit name was invalid. This is not very sensible. Instead, return infinity or -infinity as appropriate when the requested field is one that is monotonically increasing (e.g, year, epoch), or NULL when it is not (e.g., day, hour). Also, throw the expected errors for bad unit names. BACKWARDS INCOMPATIBLE CHANGE Vitaly Burovoy, reviewed by Vik Fearing http://git.postgresql.org/pg/commitdiff/647d87c56ab6da70adb753c08d7cdf7ee905ea8a
  • Remove new coupling between NAMEDATALEN and MAX_LEVENSHTEIN_STRLEN. Commit e529cd4ffa605c6f introduced an Assert requiring NAMEDATALEN to be less than MAX_LEVENSHTEIN_STRLEN, which has been 255 for a long time. Since up to that instant we had always allowed NAMEDATALEN to be substantially more than that, this was ill-advised. It's debatable whether we need MAX_LEVENSHTEIN_STRLEN at all (versus putting a CHECK_FOR_INTERRUPTS into the loop), or whether it has to be so tight; but this patch takes the narrower approach of just not applying the MAX_LEVENSHTEIN_STRLEN limit to calls from the parser. Trusting the parser for this seems reasonable, first because the strings are limited to NAMEDATALEN which is unlikely to be hugely more than 256, and second because the maximum distance is tightly constrained by MAX_FUZZY_DISTANCE (though we'd forgotten to make use of that limit in one place). That means the cost is not really O(mn) but more like O(max(m,n)). Relaxing the limit for user-supplied calls is left for future research; given the lack of complaints to date, it doesn't seem very high priority. In passing, fix confusion between lengths-in-bytes and lengths-in-chars in comments and error messages. Per gripe from Kevin Day; solution suggested by Robert Haas. Back-patch to 9.5 where the unwanted restriction was introduced. http://git.postgresql.org/pg/commitdiff/a396144ac03b0cf337f80201df7e4663cc5a8131
  • Improve levenshtein() docs. Fix chars-vs-bytes confusion here too. Improve poor grammar and markup. http://git.postgresql.org/pg/commitdiff/80aa219146c090d46b599ac40d8d63e30532b622
  • Improve cross-platform consistency of Inf/NaN handling in trig functions. Ensure that the trig functions return NaN for NaN input regardless of what the underlying C library functions might do. Also ensure that an error is thrown for Inf (or otherwise out-of-range) input, except for atan/atan2 which should accept it. All these behaviors should now conform to the POSIX spec; previously, all our popular platforms deviated from that in one case or another. The main remaining platform dependency here is whether the C library might choose to throw a domain error for sin/cos/tan inputs that are large but less than infinity. (Doing so is not unreasonable, since once a single unit-in-the-last-place exceeds PI, there can be no significance at all in the result; however there doesn't seem to be any suggestion in POSIX that such an error is allowed.) We will report such errors if they are reported via "errno", but not if they are reported via "fetestexcept" which is the other mechanism sanctioned by POSIX. Some preliminary experiments with fetestexcept indicated that it might also report errors we could do without, such as complaining about underflow at an unreasonably large threshold. So let's skip that complexity for now. Dean Rasheed, reviewed by Michael Paquier http://git.postgresql.org/pg/commitdiff/fd5200c3dca0bc725f5848eef7ffff538f4479ed
  • Add trigonometric functions that work in degrees. The implementations go to some lengths to deliver exact results for values where an exact result can be expected, such as sind(30) = 0.5 exactly. Dean Rasheed, reviewed by Michael Paquier http://git.postgresql.org/pg/commitdiff/e1bd684a34c11139a1bf4e5200c3bbe59a0fbfad
  • Adjust degree-based trig functions for more portability. The buildfarm isn't very happy with the results of commit e1bd684a34c11139. To try to get the expected exact results everywhere: * Replace M_PI / 180 subexpressions with a precomputed constant, so that the compiler can't decide to rearrange that division with an adjacent operation. Hopefully this will fix failures to get exactly 0.5 from sind(30) and cosd(60). * Add scaling to ensure that tand(45) and cotd(45) give exactly 1; there was nothing particularly guaranteeing that before. * Replace minus zero by zero when tand() or cotd() would output that; many machines did so for tand(180) and cotd(270), but not all. We could alternatively deem both results valid, but that doesn't seem likely to be what users will want. http://git.postgresql.org/pg/commitdiff/73193d82d7c8d849774bf6952dfb4287e213c572
  • Further adjust degree-based trig functions for more portability. The last round didn't do it. Per Noah Misch, the problem on at least some machines is that the compiler pre-evaluates trig functions having constant arguments using code slightly different from what will be used at runtime. Therefore, we must prevent the compiler from seeing constant arguments to any of the libm trig functions used in this code. The method used here might still fail if init_degree_constants() gets inlined into the call sites. That probably won't happen given the large number of call sites; but if it does, we could probably fix it by making init_degree_constants() non-static. I'll avoid that till proven necessary, though. http://git.postgresql.org/pg/commitdiff/65abaab547a5758b0d6d92df4af1663bb47d545f
  • Still further adjust degree-based trig functions for more portability. Indeed, the non-static declaration foreseen in my previous commit message is necessary. Per Noah Misch. http://git.postgresql.org/pg/commitdiff/360f67d31a5656991122b89c9ca22a860f41512c
  • Yet further adjust degree-based trig functions for more portability. Buildfarm member cockatiel is still saying that cosd(60) isn't 0.5. What seems likely is that the subexpression (1.0 - cos(x)) isn't being rounded to double width before more arithmetic is done on it, so force that by storing it into a variable. http://git.postgresql.org/pg/commitdiff/00347575e2754b1aaacd357776560803564d3f35
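
The extract() and degree-based trigonometric commits above have directly visible SQL effects; as described in the commit messages, the new behaviour can be illustrated with queries such as:

  SELECT extract(epoch FROM 'infinity'::timestamptz);  -- now returns Infinity (epoch is monotonically increasing)
  SELECT extract(day FROM 'infinity'::timestamptz);    -- now returns NULL (day is not monotonic)
  SELECT sind(30), cosd(60), tand(45);                 -- new degree-based functions: exactly 0.5, 0.5 and 1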

Tatsuo Ishii pushed:

Andrew Dunstan pushed:

  • Remove Cygwin-specific code from pg_ctl This code has been there for a long time, but it's never really been needed. Cygwin has its own utility for registering, unregistering, stopping and starting Windows services, and that's what's used in the Cygwin postgres packages. So now pg_ctl for Cygwin looks like it is for any Unix platform. Michael Paquier and me http://git.postgresql.org/pg/commitdiff/53c949c1be2f43cd47cb433923e76ea00e9222bc

Álvaro Herrera pushed:

Bruce Momjian pushed:

Robert Haas pushed:

  • Support multi-stage aggregation. Aggregate nodes now have two new modes: a "partial" mode where they output the unfinalized transition state, and a "finalize" mode where they accept unfinalized transition states rather than individual values as input. These new modes are not used anywhere yet, but they will be necessary for parallel aggregation. The infrastructure also figures to be useful for cases where we want to aggregate local data and remote data via the FDW interface, and want to bring back partial aggregates from the remote side that can then be combined with locally generated partial aggregates to produce the final value. It may also be useful even when neither FDWs nor parallelism are in play, as explained in the comments in nodeAgg.c. David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi, and me. http://git.postgresql.org/pg/commitdiff/a7de3dc5c346e07e0439275982569996e645b3c2
  • Support parallel joins, and make related improvements. The core innovation of this patch is the introduction of the concept of a partial path; that is, a path which if executed in parallel will generate a subset of the output rows in each process. Gathering a partial path produces an ordinary (complete) path. This allows us to generate paths for parallel joins by joining a partial path for one side (which at the baserel level is currently always a Partial Seq Scan) to an ordinary path on the other side. This is subject to various restrictions at present, especially that this strategy seems unlikely to be sensible for merge joins, so only nested loops and hash joins paths are generated. This also allows an Append node to be pushed below a Gather node in the case of a partitioned table. Testing revealed that early versions of this patch made poor decisions in some cases, which turned out to be caused by the fact that the original cost model for Parallel Seq Scan wasn't very good. So this patch tries to make some modest improvements in that area. There is much more to be done in the area of generating good parallel plans in all cases, but this seems like a useful step forward. Patch by me, reviewed by Dilip Kumar and Amit Kapila. http://git.postgresql.org/pg/commitdiff/45be99f8cd5d606086e0a458c9c72910ba8a613d

Simon Riggs pushed:

  • Refactor to create generic WAL page read callback. Previously we didn’t have a generic WAL page read callback function, surprisingly. Logical decoding has logical_read_local_xlog_page(), which was actually generic, so move that to xlogfunc.c and rename to read_local_xlog_page(). Maintain logical_read_local_xlog_page() so existing callers still work. As requested by Michael Paquier, Alvaro Herrera and Andres Freund http://git.postgresql.org/pg/commitdiff/422a55a68784fd00f4514834f3649140a9166fa5
  • Speedup 2PC by skipping two phase state files in normal path. 2PC state info is written only to WAL at PREPARE, then read back from WAL at COMMIT PREPARED/ABORT PREPARED. Prepared transactions that live past one bufmgr checkpoint cycle will be written to disk in the same form as previously. Crash recovery path is not altered. Measured performance gains of 50-100% for short 2PC transactions by completely avoiding writing files and fsyncing. Other optimizations still available, further patches in related areas expected. Stas Kelvich and heavily edited by Simon Riggs Based upon earlier ideas and patches by Michael Paquier and Heikki Linnakangas, a concrete example of how Postgres-XC has fed back ideas into PostgreSQL. Reviewed by Michael Paquier, Jeff Janes and Andres Freund Performance testing by Jesper Pedersen http://git.postgresql.org/pg/commitdiff/978b2f65aa1262eb4ecbf8b3785cb1b9cf4db78e
  • Refactor headers to split out standby defs Jeff Janes http://git.postgresql.org/pg/commitdiff/c80b31d557cb4b2d2a65cb0a7e71fd961834fdb2
  • Correct comment in GetConflictingVirtualXIDs() We use Share lock because it is safe to do so. http://git.postgresql.org/pg/commitdiff/1129c2b0ad2732f301f696ae2cf98fb063a4c1f8

Peter Eisentraut pushed:

Fujii Masao pushed:

Rejected patches (so far)

No one was disappointed this week :-)

Pending patches

Alexander Shulgin sent in a patch to fix some IDENTIFICATION divisions.

Dmitry Dolgov sent in another revision of a patch to add array-style subscripting to JSONB.

Tomas Vondra sent in another revision of a patch to add multivariate statistics.

Bruce Momjian and Joe Conway traded patches to expose pg_controldata and pg_config as functions.

Artur Zakirov sent in two more revisions of a patch to implement fuzzy substring searching with the pg_trgm extension.

Ashutosh Bapat sent in two more revisions of a patch to add postgres_fdw join pushdown.

Vitaly Burovoy and Pavel Stěhule traded patches to add a custom function for converting human readable sizes to bytes.

Dilip Kumar sent in another revision of a patch to move PinBuffer and UnpinBuffer to atomics.

Thomas Munro sent in another revision of a patch to add causal reads.

SAWADA Masahiko sent in another revision of a patch to allow multiple synchronous standby servers.

Alexander Shulgin sent in another revision of a patch to add an extension called pg_logical_slot_stream_relation.

Tatsuo Ishii sent in a patch to fix a too-enthusiastic quoting of identifiers with the high bit set.

Anastasia Lubennikova sent in two more revisions of a patch to implement covering unique indexes.

Robert Haas and Etsuro Fujita traded patches to optimize write operations on the PostgreSQL FDW.

Haribabu Kommi sent in two more revisions of a patch to do aggregation in parallel.

David Rowley sent in another revision of a patch to serialize internal aggregate states.

Haribabu Kommi sent in two more revisions of a patch to add combine functions for staged aggregates.

Kyotaro HORIGUCHI sent in another revision of a patch to allow async-capable nodes to register callbacks to run the node before ExecProcNode() and provide an example using same.

Etsuro Fujita sent in another revision of a patch to optimize create_foreignscan_plan/ExecInitForeignScan.

Daniel Verité sent in another revision of a patch to add a \crosstabview to psql.

Tomas Vondra and David Rowley traded patches to optimize outer joins where the outer side is unique.

Petr Jelínek sent in another revision of a patch to enable generic WAL logical messages.

David Rowley sent in a patch to fix an issue where combining aggregates didn't work with pg_dump.

Victor Wagner sent in another revision of a patch to implement failover on the libpq connect level.

Aleksander Alekseev sent in another revision of a patch to optimize dynahashes.

Tomas Vondra sent in another revision of a patch to ensure that more predictable column statistics are being gathered and used.

Fabien COELHO sent in another revision of a patch to add better logging, etc. to pgbench.

David Rowley sent in another revision of a patch to remove functionally dependent GROUP BY columns.

Alexander Korotkov sent in another revision of a patch to add partial sorts.

Pavel Stěhule sent in another revision of a patch to add a parse_ident() function.

Artur Zakirov sent in another revision of a patch to copy regexp_t.

by N Bougain on Thursday 28 January 2016 at 21:58

Thursday 21 January 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 17 January 2016

LinuxFest Northwest will take place on 23 and 24 April 2016 at Bellingham Technical College (Washington, USA). The call for speakers is now open: http://www.linuxfestnorthwest.org/2016/present

The Swiss PGDay will be held this year at the University of Applied Sciences (HSR) in Rapperswil on 24 June 2016. The call for speakers is open: http://www.pgday.ch/

News from derived products

PostgreSQL-related job offers for January

PostgreSQL Local

  • PostgreSQL@SCaLE is a two-day, two-track event taking place on 21 and 22 January 2016 at the Pasadena Convention Center as part of SCaLE 14X. https://www.socallinuxexpo.org/scale/14x/cfp
  • FOSDEM PGDay is a one-day conference held just before FOSDEM in Brussels (Belgium) on 29 January 2016. Registration is still open: http://fosdem2016.pgconf.eu/
  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference on 17 and 18 February 2016 in Prague (Czech Republic). Site in Czech: http://www.p2d2.cz/
  • The annual Indian PGday will be held in Bangalore (Karnataka, India) on 26 February 2016: http://pgday.in
  • The first pan-Asian PostgreSQL conference will take place on 16 and 17 March 2016 in Singapore: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York. The call for speakers closes on 31 January 2016 at 23:59 EST: http://www.pgconf.us/2016/
  • FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. The call for speakers is open: https://2016.foss4g-na.org/cfp
  • PGCon 2016 will take place from 17 to 21 May 2016 in Ottawa. The call for speakers is still open: http://www.pgcon.org/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under a CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20160118004136.GB13585@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Peter Eisentraut pushed:

Robert Haas pushed:

Tom Lane pushed:

  • Avoid dump/reload problems when using both plpython2 and plpython3. Commit 803716013dc1350f installed a safeguard against loading plpython2 and plpython3 at the same time, but asserted that both could still be used in the same database, just not in the same session. However, that's not actually all that practical because dumping and reloading will fail (since both libraries necessarily get loaded into the restoring session). pg_upgrade is even worse, because it checks for missing libraries by loading every .so library mentioned in the entire installation into one session, so that you can have only one across the whole cluster. We can improve matters by not throwing the error immediately in _PG_init, but only when and if we're asked to do something that requires calling into libpython. This ameliorates both of the above situations, since while execution of CREATE LANGUAGE, CREATE FUNCTION, etc will result in loading plpython, it isn't asked to do anything interesting (at least not if check_function_bodies is off, as it will be during a restore). It's possible that this opens some corner-case holes in which a crash could be provoked with sufficient effort. However, since plpython only exists as an untrusted language, any such crash would require superuser privileges, making it "don't do that" not a security issue. To reduce the hazards in this area, the error is still FATAL when it does get thrown. Per a report from Paul Jones. Back-patch to 9.2, which is as far back as the patch applies without work. (It could be made to work in 9.1, but given the lack of previous complaints, I'm disinclined to expend effort so far back. We've been pretty desultory about support for Python 3 in 9.1 anyway.) http://git.postgresql.org/pg/commitdiff/866566a690bb9916dcd294807e65a6e173396530
  • Use LOAD not actual code execution to pull in plpython library. Commit 866566a690bb9916 is insufficient to prevent dump/reload failures when using transform modules in a database with both plpython2 and plpython3 installed. The reason is that the transform extension scripts use DO blocks as a mechanism to pull in the libpython library before creating the transform function. It's necessary to preload the library because the dynamic loader won't do it for us on every platform, leading to "unresolved symbol" failures when the transform library is loaded. But it's *not* necessary to execute Python code, and doing so will provoke a multiple-Pythons-are-loaded error even after the preceding commit. To fix, use LOAD instead of a DO block. That requires superuser privilege, but creation of a C function does anyway. It also embeds knowledge of the underlying library name for each PL language; but that's wired into the initdb-time contents of pg_pltemplate too, so that doesn't seem like a large problem either. Note that CREATE TRANSFORM as such doesn't call the language module at all. Per a report from Paul Jones. Back-patch to 9.5 where transform modules were introduced. http://git.postgresql.org/pg/commitdiff/fb6fcbd33fbbd6d31fa2b39938e60ecb48dc4de4
  • Remove no-longer-needed old-style check for incompatible plpythons. Commit 866566a690bb9916 introduced a new mechanism for incompatible plpythons to detect each other. I left the old mechanism in place, because it seems possible that a plpython predating that commit might be used with one postdating it. (This would require updating plpython3 but not plpython2 or vice versa, but that seems well within the realm of possibility.) However, surely it will not be able to happen in 9.6 or later, so we can delete the old mechanism in HEAD. http://git.postgresql.org/pg/commitdiff/796d1e889f2b5f88b33a425fdfd08d7906cbd66a
  • Run pgindent on src/bin/pg_dump/* To ease doing indent fixups on a couple of patches I have in progress. http://git.postgresql.org/pg/commitdiff/26905e009babe6020fddcf3820e57e2f87c5539c
  • Access pg_dump's options structs through Archive struct, not directly. Rather than passing around DumpOptions and RestoreOptions as separate arguments, add fields to struct Archive to carry pointers to these objects, and access them through those fields when needed. There already was a RestoreOptions pointer in Archive, though for no obvious reason it was part of the "private" struct rather than out where pg_dump.c could see it. Doing this allows reversion of quite a lot of parameter-addition changes made in commit 0eea8047bf, which is a good thing IMO because this will reduce the code delta between 9.4 and 9.5, probably easing a few future back-patch efforts. Moreover, the previous commit only added a DumpOptions argument to functions that had to have it at the time, which means we could anticipate still more code churn (and more back-patch hazard) as the requirement spread further. I'd hit exactly that problem in my upcoming patch to fix extension membership marking, which is what motivated me to do this. http://git.postgresql.org/pg/commitdiff/5b5fea2a11741e651f7c25e981dd29b610a08426
  • Handle extension members when first setting object dump flags in pg_dump. pg_dump's original approach to handling extension member objects was to run around and clear (or set) their dump flags rather late in its data collection process. Unfortunately, quite a lot of code expects those flags to be valid before that; which was an entirely reasonable expectation before we added extensions. In particular, this explains Karsten Hilbert's recent report of pg_upgrade failing on a database in which an extension has been installed into the pg_catalog schema. Its objects are initially marked as not-to-be-dumped on the strength of their schema, and later we change them to must-dump because we're doing a binary upgrade of their extension; but we've already skipped essential tasks like making associated DO_SHELL_TYPE objects. To fix, collect extension membership data first, and incorporate it in the initial setting of the dump flags, so that those are once again correct from the get-go. This has the undesirable side effect of slightly lengthening the time taken before pg_dump acquires table locks, but testing suggests that the increase in that window is not very much. Along the way, get rid of ugly special-case logic for deciding whether to dump procedural languages, FDWs, and foreign servers; dump decisions for those are now correct up-front, too. In 9.3 and up, this also fixes erroneous logic about when to dump event triggers (basically, they were *always* dumped before). In 9.5 and up, transform objects had that problem too. Since this problem came in with extensions, back-patch to all supported versions. http://git.postgresql.org/pg/commitdiff/e72d7d85310c397a94748db72d73a59c57e0b0dc
  • Fix build_grouping_chain() to not clobber its input lists. There's no good reason for stomping on the input data; it makes the logic in this function no simpler, in fact probably the reverse. And it makes it impossible to separate path generation from plan generation, as I'm working towards doing; that will require more than one traversal of these lists. http://git.postgresql.org/pg/commitdiff/a923af382c5678f3dfb591aacb6b90bf4e5ed7a9
  • Remove dead code in pg_dump. Coverity quite reasonably complained that this check for fout==NULL occurred after we'd already dereferenced fout. However, the check is just dead code since there is no code path by which CreateArchive can return a null pointer. Errors such as can't-open-that-file are reported down inside CreateArchive, and control doesn't return. So let's silence the warning by removing the dead code, rather than continuing to pretend it does something. Coverity didn't complain about this before 5b5fea2a1, so back-patch to 9.5 like that patch. http://git.postgresql.org/pg/commitdiff/57ce9acc04483df4913921d4ff21f01483583fb8

Simon Riggs pushed:

Magnus Hagander pushed:

Rejected patches (so far)

No one was disappointed this week :-)

Pending patches

Simon Riggs sent in another revision of a patch to speed up 2PC.

Marisa Emerson sent in another revision of a patch to support BSD Authentication.

Elvis Pranskevichus sent in a patch to fix an issue with dumping domain constraints in pg_dump.

Michael Paquier and Fabien COELHO traded patches to extend pgbench expressions with functions.

Dilip Kumar sent in another revision of a patch to help make relation extension more scalable.

Etsuro Fujita sent in another revision of a patch to help make FDW DML pushdown more efficient.

SAWADA Masahiko sent in two more revisions of a patch to add a "frozen" bit to the visibility map.

Anastasia Lubennikova sent in two more revisions of a patch to implement covering unique indexes.

Amit Kapila sent in a patch to optimize parallelism code for launched workers usage.

Michael Paquier sent in a patch to remove the service-related code in pg_ctl for Cygwin.

Tom Lane sent in a patch to show what might be needed to fix PL/Python[23] issues in PostgreSQL 9.1. In view of 9.1's short remaining life, this is not going to be applied.

David Rowley sent in two more revisions of a patch to remove functionally dependent GROUP BY columns.

Filip Rembiałkowski sent in a patch to document the fact that BYPASSRLS is a CREATE USER option.

Andres Freund sent in a patch to allow to easily choose between the readiness primitives in unix_latch.c, error out if waiting on socket readiness without a specified socket, only clear unix_latch.c's self-pipe if it actually contains data, and support using epoll as the polling primitive in unix_latch.c.

Amit Langote sent in a patch to fix a comment thinko in expand_inherited_rtentry().

Vinayak Pokale sent in a patch to fix a typo in sequence.c.

Tatsuro Yamada sent in a patch to fix a comment typo in port/atomics/generic.h.

David Rowley sent in two more revisions of a patch to implement combining aggregates.

Mithun C Y sent in a patch to cache snapshot data and avoid concurrent writes to the cache.

Etsuro Fujita sent in another revision of a patch relating to create_foreignscan_plan.

Etsuro Fujita sent in a patch to update a comment in setrefs.c.

Etsuro Fujita sent in a patch to make a minor documentation tweak to GetForeignPlan documentation.

Alexander Shulgin sent in a POC patch to create pg_logical_slot_stream_relation.

Christian Ullrich sent in another revision of a patch to fix an error in SSPI auth that could cause the wrong realm name to be used.

Jeff Janes sent in another revision of a patch to expose the GIN clean pending list to SQL.
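
If that patch lands with the function name used in the thread (an assumption at this stage), manually flushing an index's pending list might look like this, with a hypothetical index name:

  SELECT gin_clean_pending_list('my_gin_index'::regclass);  -- proposed function; returns the number of pending-list pages processed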

Constantin S. Pan sent in a patch to speed up GIN build with parallel workers.

Andreas Seltenreich sent in a patch to improve spinlock inline assembly for x86.

José Arthur Benetasso Villanova sent in a patch to log the operating system user name of clients connecting via Unix socket.

Dean Rasheed sent in another revision of a patch to implement trigonometric functions in degrees.

by N Bougain on Thursday 21 January 2016 at 01:19

Wednesday 13 January 2016

Sébastien Lardière

Version 9.5 of PostgreSQL - 3

A new major version of PostgreSQL has been available since 7 January. Each PostgreSQL release adds its own batch of features, for developers and administrators alike. This version brings many features aimed at improving performance when querying large volumes of data.

This three-post overview introduces three kinds of features:

This final post in the series lists a few configuration parameters that make their appearance in this new version. Tracking COMMIT timestamps: the track_commit_timestamp parameter makes it possible to mark each commit ("COMMIT") in the transaction logs ("WAL") with the server's date and time. This parameter is a boolean, and... Read Version 9.5 of PostgreSQL - 3
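
As a brief illustration of that parameter (a minimal sketch; the table and column names are invented), it can be enabled and the recorded timestamps queried like this:

  ALTER SYSTEM SET track_commit_timestamp = on;  -- takes effect after a server restart
  -- once enabled, the commit timestamp of a row's inserting/updating transaction can be looked up:
  SELECT pg_xact_commit_timestamp(xmin), id FROM my_table LIMIT 1;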

by Sébastien Lardière on Wednesday 13 January 2016 at 15:12

Rodolphe Quiédeville

Multi-column GIN and GiST indexes

This post will interest every user of hstore or json columns with PostgreSQL. Although it uses hstore as its example, it applies equally to json or jsonb columns.

Let's start by creating a table and filling it with 100,000 rows of random data. Our example represents articles that are associated with a language identifier (lang_id) and categorised tags (tags); here each article may be associated with a country, which will be either Turkey or Iceland.

~# CREATE TABLE article (id int4, lang_id int4, tags hstore);
CREATE TABLE
~# INSERT INTO article 
SELECT generate_series(1,10e4::int4), cast(random()*20 as int),
CASE WHEN random() > 0.5 
THEN 'country=>Turquie'::hstore 
WHEN random() > 0.8 THEN 'country=>Islande' ELSE NULL END AS x;
INSERT 0 100000

For efficient searches of articles in a given language, we add a B-tree index on the lang_id column and a GIN index on the tags column.

~# CREATE INDEX ON article(lang_id);
CREATE INDEX
~# CREATE INDEX ON article USING GIN (tags);
CREATE INDEX

We now have our data and our indexes, so we can start searching. Let's look for all the articles written in French (assuming the id for French is 17) that are associated with a country (they have a country tag), and analyse the execution plan.

~# EXPLAIN ANALYZE SELECT * FROM article WHERE lang_id=17 AND tags ? 'country';
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on article  (cost=122.42..141.21 rows=5 width=35) (actual time=12.348..13.912 rows=3018 loops=1)
   Recheck Cond: ((tags ? 'country'::text) AND (lang_id = 17))
   Heap Blocks: exact=663
   ->  BitmapAnd  (cost=122.42..122.42 rows=5 width=0) (actual time=12.168..12.168 rows=0 loops=1)
         ->  Bitmap Index Scan on article_tags_idx  (cost=0.00..12.75 rows=100 width=0) (actual time=11.218..11.218 rows=60051 loops=1)
               Index Cond: (tags ? 'country'::text)
         ->  Bitmap Index Scan on article_lang_id_idx  (cost=0.00..109.42 rows=4950 width=0) (actual time=0.847..0.847 rows=5016 loops=1)
               Index Cond: (lang_id = 17)
 Planning time: 0.150 ms
 Execution time: 14.111 ms
(10 rows)

As expected, we get two index scans followed by a combining operation to produce the final result. To gain a little performance, the natural idea would be to create a multi-column index containing lang_id and tags, but if you have already tried to do so you will have hit this error message:

~# CREATE INDEX ON article USING GIN (lang_id, tags);
ERROR:  42704: data type integer has no default operator class for access method "gin"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.
LOCATION:  GetIndexOpClass, indexcmds.c:1246

The HINT gives an interesting lead: indeed, GIN indexes cannot be applied to columns of type int4 (and many other types).

The solution lies in using a standard extension that combines the GIN and B-tree operators, btree_gin; note right away that the btree_gist equivalent also exists.

Like any extension, it is installed as simply as:

~# CREATE EXTENSION btree_gin;
CREATE EXTENSION

We can now create our multi-column index and replay our query to see the difference.

~# CREATE INDEX ON article USING GIN (lang_id, tags);
CREATE INDEX
~# EXPLAIN ANALYZE SELECT * FROM article WHERE lang_id=17 AND tags ? 'country';
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on article  (cost=24.05..42.84 rows=5 width=35) (actual time=1.983..3.777 rows=3018 loops=1)
   Recheck Cond: ((lang_id = 17) AND (tags ? 'country'::text))
   Heap Blocks: exact=663
   ->  Bitmap Index Scan on article_lang_id_tags_idx  (cost=0.00..24.05 rows=5 width=0) (actual time=1.875..1.875 rows=3018 loops=1)
         Index Cond: ((lang_id = 17) AND (tags ? 'country'::text))
 Planning time: 0.211 ms
 Execution time: 3.968 ms
(7 rows)

Reading this second explain, the gain is clear: even with a small data set the estimated cost is divided by 3, and we save one index scan and one combining step. We can now drop the other two indexes and keep only this one.
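
For completeness, dropping the two single-column indexes that the multi-column index now makes redundant could look like this, using the index names PostgreSQL generated earlier:

~# DROP INDEX article_lang_id_idx;
DROP INDEX
~# DROP INDEX article_tags_idx;
DROP INDEX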

by Rodolphe Quiédeville on Wednesday 13 January 2016 at 14:11

Tuesday 12 January 2016

Sébastien Lardière

Version 9.5 of PostgreSQL - 2

A new major version of PostgreSQL has been available since 7 January. Each PostgreSQL release adds its own batch of features, for developers and administrators alike. This version brings many features aimed at improving performance when querying large volumes of data.

This three-post overview introduces three kinds of features:

This second post covers the engine's new internal methods, that is, the new tools PostgreSQL has at its disposal for processing data. BRIN indexes: this is a new index type, created to solve problems of access to very large volumes of data; the typical use case is a log table, in which the data are... Read Version 9.5 of PostgreSQL - 2

by Sébastien Lardière on Tuesday 12 January 2016 at 10:45

Monday 11 January 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 10 January 2016

PostgreSQL 9.5 is available! http://www.postgresql.org/docs/current/static/release-9-5.html https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5
[translator's note: French-language version: http://blog.postgresql.fr/index.php?post/PostgreSQL-9-5]

FOSS4G NA (Free and Open Source Software for Geospatial - North America) will be held in Raleigh, North Carolina, from 2 to 5 May 2016. The call for speakers is open: https://2016.foss4g-na.org/cfp

PostgreSQL@SCaLE will take place on 21 & 22 January 2016 at the Pasadena Convention Center: https://reg.socallinuxexpo.org/reg6/

News from derived products

PostgreSQL-related job offers for January

PostgreSQL Local

  • PostgreSQL@SCaLE is a two-day, two-track event taking place on 21 and 22 January 2016 at the Pasadena Convention Center as part of SCaLE 14X. https://www.socallinuxexpo.org/scale/14x/cfp
  • FOSDEM PGDay is a one-day conference held just before FOSDEM in Brussels (Belgium) on 29 January 2016. Registration is still open: http://fosdem2016.pgconf.eu/
  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference on 17 and 18 February 2016 in Prague (Czech Republic). Site in Czech: http://www.p2d2.cz/
  • The annual Indian PGday will be held in Bangalore (Karnataka, India) on 26 February 2016. The call for speakers is open: http://pgday.in
  • The first pan-Asian PostgreSQL conference will take place on 16 and 17 March 2016 in Singapore. The call for speakers is open: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki (Finland) on 17 March 2016. Registration is still open: http://2016.nordicpgday.org/
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York. The call for speakers closes on 31 January 2016 at 23:59 EST: http://www.pgconf.us/2016/
  • PGCon 2016 will take place from 17 to 21 May 2016 in Ottawa. The call for speakers has been launched: http://www.pgcon.org/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team under a CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20160110232921.GA3645@fetter.org

Submit your articles or announcements before Sunday 15:00 (Pacific time). Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Tom Lane pushed:

  • Do some copy-editing on the docs for row-level security. Clarifications, markup improvements, corrections of misleading or outright wrong statements. http://git.postgresql.org/pg/commitdiff/c1611db01fec587525e88270854c4b993846dcb3
  • Fix bogus lock release in RemovePolicyById and RemoveRoleFromObjectPolicy. Can't release the AccessExclusiveLock on the target table until commit. Otherwise there is a race condition whereby other backends might service our cache invalidation signals before they can actually see the updated catalog rows. Just to add insult to injury, RemovePolicyById was closing the rel (with incorrect lock drop) and then passing the now-dangling rel pointer to CacheInvalidateRelcache. Probably the reason this doesn't fall over on CLOBBER_CACHE buildfarm members is that some outer level of the DROP logic is still holding the rel open ... but it'd have bit us on the arse eventually, no doubt. http://git.postgresql.org/pg/commitdiff/f47b602df80d7647ca2e71c86f7228b1bf5bf9f3
  • Fix regrole and regnamespace types to honor quoting like other reg* types. Aside from any consistency arguments, this is logically necessary because the I/O functions for these types also handle numeric OID values. Without a quoting rule it is impossible to distinguish numeric OIDs from role or namespace names that happen to contain only digits. Also change the to_regrole and to_regnamespace functions to dequote their arguments. While not logically essential, this seems like a good idea since the other to_reg* functions do it. Anyone who really wants raw lookup of an uninterpreted name can fall back on the time-honored solution of (SELECT oid FROM pg_namespace WHERE nspname = whatever). Report and patch by Jim Nasby, reviewed by Michael Paquier http://git.postgresql.org/pg/commitdiff/fb1227af67eae5e97795f7e3563673c6e67d2844
  • Fix regrole and regnamespace output functions to do quoting, too. We discussed this but somehow failed to implement it... (A short example of the quoting behavior follows this list.) http://git.postgresql.org/pg/commitdiff/b0cadc08fea564f75a0702e15b2bd949377bd2f3
  • Adjust behavior of row_security GUC to match the docs. Some time back we agreed that row_security=off should not be a way to bypass RLS entirely, but only a way to get an error if it was being applied. However, the code failed to act that way for table owners. Per discussion, this is a must-fix bug for 9.5.0. Adjust the logic in rls.c to behave as expected; also, modify the error message to be more consistent with the new interpretation. The regression tests need minor corrections as well. Also update the comments about row_security in ddl.sgml to be correct. (The official description of the GUC in config.sgml is already correct.) I failed to resist the temptation to do some other very minor cleanup as well, such as getting rid of a duplicate extern declaration. http://git.postgresql.org/pg/commitdiff/5d35438273c4523a4dc4b48c3bd575e64310d3d4
  • Docs: provide a concrete discussion and example for RLS race conditions. Commit 43cd468cf01007f3 added some wording to create_policy.sgml purporting to warn users against a race condition of the sort that had been noted some time ago by Peter Geoghegan. However, that warning was far too vague to be useful (or at least, I completely failed to grasp what it was on about). Since the problem case occurs with a security design pattern that lots of people are likely to try to use, we need to be as clear as possible about it. Provide a concrete example in the main-line docs in place of the original warning. http://git.postgresql.org/pg/commitdiff/7debf36072b3a088b3003aab6dcf57c3f186100d
  • In psql's tab completion, change most TailMatches patterns to Matches. In the refactoring in commit d37b816dc9e8f976c8913296781e08cbd45c5af1, we mostly kept to the original design whereby only the last few words on the line were matched to identify a completable pattern. However, after commit d854118c8df8c413d069f7e88bb01b9e18e4c8ed, there's really no reason to do it like that: where it's sensible, we can use patterns that expect to match the entire input line. And mostly, it's sensible. Matching the entire line greatly reduces the odds of a false match that leads to offering irrelevant completions. Moreover (though I've not tried to measure this), it should make tab completion faster since many of the patterns will be discarded after a single integer comparison that finds that the wrong number of words appear on the line. There are certain identifiable places where we still need to use TailMatches because the statement in question is allowed to appear embedded in a larger statement. These are just a small minority of the existing patterns, though, so the benefit of switching where possible is large. It's possible that this patch has removed some within-line matching behaviors that are in fact desirable, but we can put those back when we get complaints. Most of the removed behaviors are certainly silly. Michael Paquier, with some further adjustments by me http://git.postgresql.org/pg/commitdiff/9b181b0363deb65b15a9feaf3eb74f86707498a9
  • Convert psql's tab completion for backslash commands to the new style. This requires adding some more infrastructure to handle both case-sensitive and case-insensitive matching, as well as the ability to match a prefix of a previous word. So it ends up being about a wash line-count-wise, but it's just as big a readability win here as in the SQL tab completion rules. Michael Paquier, some adjustments by me http://git.postgresql.org/pg/commitdiff/4f18010af126f126824e01eec2285e6263d98b3d
  • Add to_regnamespace() and to_regrole() to the documentation. Commits cb9fa802b32b222b and 0c90f6769de6a60f added these functions, but did not bother with documentation. http://git.postgresql.org/pg/commitdiff/83be1844acdcb0cbff31369a65ec61d588fbe9f3
  • Make the to_reg*() functions accept text not cstring. Using cstring as the input type was a poor decision, because that's not really a full-fledged type. In particular, it lacks implicit coercions from text or varchar, meaning that usages like to_regproc('foo'||'bar') wouldn't work; basically the only case that did work without explicit casting was a simple literal constant argument. The lack of field complaints about this suggests that hardly anyone is using these functions, so hopefully fixing it won't cause much of a compatibility problem. They've only been there since 9.4, anyway. Petr Korobeinikov http://git.postgresql.org/pg/commitdiff/ea0d494dae0d3d6fce26bf5d6fbaa07e2ee6c402
  • In opr_sanity regression test, check for unexpected uses of cstring. In light of commit ea0d494dae0d3d6f, it seems like a good idea to add a regression test that will complain about random functions taking or returning cstring. Only I/O support functions and encoding conversion functions should be declared that way. While at it, add some checks that encoding conversion functions are declared properly. Since pg_conversion isn't populated manually, it's not quite as necessary to check its contents as it is for catalogs like pg_proc; but one thing we definitely have not tested in the past is whether the identified conproc for a conversion actually does that conversion vs. some other one. http://git.postgresql.org/pg/commitdiff/921191912c48a68db81c02c02f3bc22e291d918c
  • Sort $(wildcard) output where needed for reproducible build output. The order of inclusion of .o files makes a difference in linker output; not a functional difference, but still a bitwise difference, which annoys some packagers who would like reproducible builds. Report and patch by Christoph Berg http://git.postgresql.org/pg/commitdiff/3343ea9e8ea4f552b3f6e5436938f2f0e153b947
  • Remove some ancient and unmaintained encoding-conversion test cruft. In commit 921191912c48a68d I claimed that we weren't testing encoding conversion functions, but further poking around reveals that we did have an equivalent though hard-wired set of tests in conversion.sql. AFAICS there is no advantage to doing it like that as compared to letting the catalog contents drive the test, so let the opr_sanity addition stand and remove the now-redundant tests in conversion.sql. Also, remove some infrastructure in src/backend/utils/mb/conversion_procs for building conversion.sql's list of tests. That was unmaintained, and had not corresponded to the actual contents of conversion.sql since 2007 or perhaps even further back. http://git.postgresql.org/pg/commitdiff/419400c5da738d86c87e903a3d1924ff365bf203
  • Comment typo fix. Per Amit Langote. http://git.postgresql.org/pg/commitdiff/4bf87169cc1890442aa694f3057e0a0ad60c51f4
  • In initdb's post-bootstrap phase, drop temp tables explicitly. Although these temp tables will get removed from template1 at the end of the standalone-backend run, that's too late to keep them from getting copied into the template0 and postgres databases, now that we use only a single backend run for the whole sequence. While no real harm is done by the extra copies (since they'd be deleted on first use of the temp schema), it's still unsightly, and it would mean some wasted cycles for every database creation for the life of the installation. Oversight in commit c4a8812cf64b1426. Noticed by Amit Langote. http://git.postgresql.org/pg/commitdiff/dad08994b25b8cd2caa83b2e856fcc940d5e515c
  • Provide more detail in postmaster log for password authentication failures. We tell people to examine the postmaster log if they're unsure why they are getting auth failures, but actually only a few relatively-uncommon failure cases were given their own log detail messages in commit 64e43c59b817a78d. Expand on that so that every failure case detected within md5_crypt_verify gets a specific log detail message. This should cover pretty much every ordinary password auth failure cause. So far I've not noticed user demand for a similar level of auth detail for the other auth methods, but sooner or later somebody might want to work on them. This is not that patch, though. http://git.postgresql.org/pg/commitdiff/5e0b5dcab685fe2a342385450a29a825cf40cddf
  • Remove vestigial CHECK_FOR_INTERRUPTS call. Commit e710b65c inserted code in md5_crypt_verify to disable and later re-enable interrupts, with a CHECK_FOR_INTERRUPTS call as part of the second step, to process any interrupts that had been held off. Commit 6647248e removed the interrupt disable/re-enable code, but left behind the CHECK_FOR_INTERRUPTS, even though this is now an entirely random, pointless place for one. md5_crypt_verify doesn't run long enough to need such a check, and if it did, this would still be the wrong place to put one. http://git.postgresql.org/pg/commitdiff/6b1a837f69d00d265bee4b57ba2d320f1463f131
  • Use plain mkdir() not pg_mkdir_p() to create subdirectories of PGDATA. When we're creating subdirectories of PGDATA during initdb, we know darn well that the parent directory exists (or should exist) and that the new subdirectory doesn't (or shouldn't). There is therefore no need to use anything more complicated than mkdir(). Using pg_mkdir_p() just opens us up to unexpected failure modes, such as the one exhibited in bug #13853 from Nuri Boardman. It's not very clear why pg_mkdir_p() went wrong there, but it is clear that we didn't need to be trying to create parent directories in the first place. We're not even saving any code, as proven by the fact that this patch nets out at minus five lines. Since this is a response to a field bug report, back-patch to all branches. http://git.postgresql.org/pg/commitdiff/33b054bc797628e418e379badd38b00e4b523115
  • Fix unobvious interaction between -X switch and subdirectory creation. Turns out the only reason initdb -X worked is that pg_mkdir_p won't whine if you point it at something that's a symlink to a directory. Otherwise, the attempt to create pg_xlog/ just like all the other subdirectories would have failed. Let's be a little more explicit about what's happening. Oversight in my patch for bug #13853 (mea culpa for not testing -X ...) http://git.postgresql.org/pg/commitdiff/b41fb65056076b42d64a8690d61fd73dc648645b
  • Delay creation of subplan tlist until after create_plan(). Once upon a time it was necessary for grouping_planner() to determine the tlist it wanted from the scan/join plan subtree before it called query_planner(), because query_planner() would actually make a Plan using that. But we refactored things a long time ago to delay construction of the Plan tree till later, so there's no need to build that tlist until (and indeed unless) we're ready to plaster it onto the Plan. The only thing query_planner() cares about is what Vars are going to be needed for the tlist, and it can perfectly well get that by looking at the real tlist rather than some masticated version. Well, actually, there is one minor glitch in that argument, which is that make_subplanTargetList also adds Vars appearing only in HAVING to the tlist it produces. So now we have to account for HAVING explicitly in build_base_rel_tlists. But that just adds a few lines of code, and I doubt it moves the needle much on processing time; we might be doing pull_var_clause() twice on the havingQual, but before we had it scanning dummy tlist entries instead. This is a very small down payment on rationalizing grouping_planner enough so it can be refactored. http://git.postgresql.org/pg/commitdiff/c44d013835049053d19bc1795f0d169f3d1d6ff0
  • Marginal cleanup of GROUPING SETS code in grouping_planner(). Improve comments and make it a shade less messy. I think we might want to move all of this somewhere else later, but it needs to be more readable first. In passing, re-pgindent the file, affecting some recently-added comments concerning parallel query planning. http://git.postgresql.org/pg/commitdiff/a54676acadcf811f6945db15e81651df96beabc4
  • Add STRICT to some C functions created by the regression tests. These functions readily crash when passed a NULL input value. The tests themselves do not pass NULL values to them; but when the regression database is used as a basis for fuzz testing, they cause a lot of noise. Also, if someone were to leave a regression database lying about in a production installation, these would create a minor security hazard. Andreas Seltenreich http://git.postgresql.org/pg/commitdiff/529baf6a2f3fe85e7e6b4ad3ca38ed4ebffd6bb4
  • Clean up code for widget_in() and widget_out(). Given syntactically wrong input, widget_in() could call atof() with an indeterminate pointer argument, typically leading to a crash; or if it didn't do that, it might return a NULL pointer, which again would lead to a crash since old-style C functions aren't supposed to do things that way. Fix that by correcting the off-by-one syntax test and throwing a proper error rather than just returning NULL. Also, since widget_in and widget_out have been marked STRICT for a long time, their tests for null inputs are just dead code; remove 'em. In the oldest branches, also improve widget_out to use snprintf not sprintf, just to be sure. In passing, get rid of a long-since-useless sprintf into a local buffer that nothing further is done with, and make some other minor coding style cleanups. In the intended regression-testing usage of these functions, none of this is very significant; but if the regression test database were left around in a production installation, these bugs could amount to a minor security hazard. Piotr Stefaniak, Michael Paquier, and Tom Lane http://git.postgresql.org/pg/commitdiff/1cb63c791c7d070c1bb3cce58885c9697d769cd2
  • Clean up some lack-of-STRICT issues in the core code, too. A scan for missed proisstrict markings in the core code turned up these functions: brin_summarize_new_values pg_stat_reset_single_table_counters pg_stat_reset_single_function_counters pg_create_logical_replication_slot pg_create_physical_replication_slot pg_drop_replication_slot The first three of these take OID, so a null argument will normally look like a zero to them, resulting in "ERROR: could not open relation with OID 0" for brin_summarize_new_values, and no action for the pg_stat_reset_XXX functions. The other three will dump core on a null argument, though this is mitigated by the fact that they won't do so until after checking that the caller is superuser or has rolreplication privilege. In addition, the pg_logical_slot_get/peek[_binary]_changes family was intentionally marked nonstrict, but failed to make nullness checks on all the arguments; so again a null-pointer-dereference crash is possible but only for superusers and rolreplication users. Add the missing ARGISNULL checks to the latter functions, and mark the former functions as strict in pg_proc. Make that change in the back branches too, even though we can't force initdb there, just so that installations initdb'd in future won't have the issue. Since none of these bugs rise to the level of security issues (and indeed the pg_stat_reset_XXX functions hardly misbehave at all), it seems sufficient to do this. In addition, fix some order-of-operations oddities in the slot_get_changes family, mostly cosmetic, but not the part that moves the function's last few operations into the PG_TRY block. As it stood, there was significant risk for an error to exit without clearing historical information from the system caches. The slot_get_changes bugs go back to 9.4 where that code was introduced. Back-patch appropriate subsets of the pg_proc changes into all active branches, as well. http://git.postgresql.org/pg/commitdiff/26d538dc93543ed80c315b8313ea4dacd7309ff6
  • Add some checks on "char"-type columns to type_sanity and opr_sanity. I noticed that the sanity checks in the regression tests omitted to check a couple of "poor man's enum" columns that you'd reasonably expect them to check. There are other "char"-type columns in system catalogs that are not covered by either type_sanity or opr_sanity, e.g. pg_rewrite.ev_type. However, those catalogs are not populated with any manually-created data during bootstrap, so it seems less necessary to check them this way. http://git.postgresql.org/pg/commitdiff/3ef16c46fb3a64c150a3b42c3cc4a8538a12ff3f
  • Remove a useless PG_GETARG_DATUM() call from jsonb_build_array. This loop uselessly fetched the argument after the one it's currently looking at. No real harm is done since we couldn't possibly fetch off the end of memory, but it's confusing to the reader. Also remove a duplicate (and therefore confusing) PG_ARGISNULL check in jsonb_build_object. I happened to notice these things while trolling for missed null-arg checks earlier today. Back-patch to 9.5, not because there is any real bug, but just because 9.5 and HEAD are still in sync in this file and we might as well keep them so. In passing, re-pgindent. http://git.postgresql.org/pg/commitdiff/820bdccc1be22513e1aaa441d554992a5a2e314f
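
A quick sketch of the reg* quoting behavior mentioned above (the object names are invented for illustration):

    -- A purely numeric string is read as an OID; quoting forces a name lookup instead.
    SELECT '12345'::regrole;       -- treated as OID 12345
    SELECT '"12345"'::regrole;     -- looked up as a role literally named 12345 (error if absent)

    -- The to_reg* functions dequote their argument too, and return NULL for a missing object:
    SELECT to_regnamespace('"pg_catalog"');
    SELECT to_regrole('"no such role"');   -- NULL rather than an error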

Robert Haas pushed:

Álvaro Herrera pushed:

  • Make pg_shseclabel available in early backend startup. While the in-core authentication mechanism doesn't need to access pg_shseclabel at all, it's reasonable to think that an authentication hook will want to look at the label for the role logging in, or for rows in other catalogs used during the authentication phase of startup. Catalog version bumped, because this changes the "is nailed" status for pg_shseclabel. Author: Adam Brightwell http://git.postgresql.org/pg/commitdiff/efa318bcfac132c48dff8196f726e56a6843f06b
  • Make pg_receivexlog silent with 9.3 and older servers. A pointless and confusing error message is shown to the user when attempting to identify a 9.3 or older remote server with a 9.5/9.6 pg_receivexlog, because the return signature of IDENTIFY_SYSTEM was changed in 9.4. There's no good reason for the warning message, so shuffle code around to keep it quiet. (pg_recvlogical is also affected by this commit, but since it obviously cannot work with 9.3 that doesn't actually matter much.) Backpatch to 9.5. Reported by Marco Nenciarini, who also wrote the initial patch. Further tweaked by Robert Haas and Fujii Masao; reviewed by Michael Paquier and Craig Ringer. http://git.postgresql.org/pg/commitdiff/4aecd22d3c84c44dd230426bcccd286798ac6b65
  • Add scale(numeric). Author: Marko Tiikkaja. (A usage example follows this list.) http://git.postgresql.org/pg/commitdiff/abb1733922f3ff17a514499883a549f8bd03af44
  • Windows: Make pg_ctl reliably detect service status. pg_ctl is using isatty() to verify whether the process is running in a terminal, and if not it sends its output to Windows' Event Log ... which does the wrong thing when the output has been redirected to a pipe, as reported in bug #13592. To fix, make pg_ctl use the code we already have to detect service-ness: in the master branch, move src/backend/port/win32/security.c to src/port (with suitable tweaks so that it runs properly in backend and frontend environments); pg_ctl already has access to pgport so it Just Works. In older branches, that's likely to cause trouble, so instead duplicate the required code in pg_ctl.c. Author: Michael Paquier Bug report and diagnosis: Egon Kocjan Backpatch: all supported branches http://git.postgresql.org/pg/commitdiff/a967613911f7ef7b6387b9e8718f0ab8f0c4d9c8
  • pgstat: add WAL receiver status view & SRF. This new view provides insight into the state of a running WAL receiver in a HOT standby node. The information returned includes the PID of the WAL receiver process, its status (stopped, starting, streaming, etc), start LSN and TLI, last received LSN and TLI, timestamp of last message send and receipt, latest end-of-WAL LSN and time, and the name of the slot (if any). Access to the detailed data is only granted to superusers; others only get the PID. Author: Michael Paquier Reviewer: Haribabu Kommi http://git.postgresql.org/pg/commitdiff/b1a9bad9e744857291c7d5516080527da8219854
  • Add win32security to LIBOBJS. This seems to fix Mingw's compile that was broken in a967613911f7e, as evidenced by buildfarm. http://git.postgresql.org/pg/commitdiff/fa838b555f90039ae5f0e6fb86ccae6a88b42703
  • Fix order of arguments to va_start() http://git.postgresql.org/pg/commitdiff/f81c966d2095fdab70a5d81ceb6dd9c89f4acd87
  • Blind attempt at a Cygwin fix. Further portability fix for a967613911f7. Mingw- and MSVC-based builds appear to be working fine, but Cygwin needs an extra tweak whereby the new win32security.c file is explicitly added to the list of files to build in pgport, per Cygwin members brolga and lorikeet. Author: Michael Paquier http://git.postgresql.org/pg/commitdiff/e9282e953205a2f3125fc8d1052bc01cb77cd2a3
  • Revert "Blind attempt at a Cygwin fix." This reverts commit e9282e953205a2f3125fc8d1052bc01cb77cd2a3, which blew up in a pretty spectacular way. Re-introduce the original code while we search for a real fix. http://git.postgresql.org/pg/commitdiff/463172116634423f8708ad9d7afb0f759a40cf2c

Tatsuo Ishii pushed:

Magnus Hagander pushed:

Simon Riggs pushed:

  • Avoid pin scan for replay of XLOG_BTREE_VACUUM. Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds or in some cases minutes while the XLOG_BTREE_VACUUM was replayed. This commit skips the "pin scan" that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here. The current commit still performs the pin scan for toast indexes, though this can also be avoided if we recheck scans on toast indexes. Later patch will address this. No tests included. Manual tests using an additional patch to view WAL records and their timing have shown the change in WAL records and their handling has successfully reduced replication delay. http://git.postgresql.org/pg/commitdiff/687f2cd7a0150647794efe432ae0397cb41b60ff
  • Revoke change to rmgr desc of btree vacuum. Per discussion with Andres Freund http://git.postgresql.org/pg/commitdiff/b6028426137532afae00188405fdecf7057b208c

Rejected Patches (for now)

No one was disappointed this week :-)

Pending Patches

Ian Lawrence Barwick sent in a patch to update the description of vacuumdb to fit current realities.

Haribabu Kommi sent in four more revisions of a patch to implement multi-tenancy with RLS.

Pavel Stěhule sent in another revision of a patch to implement num_notnulls().

Pavel Stěhule sent in three more revisions of a patch to implement pg_size_bytes().
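
For context, the behavior this patch aims at (and which was later committed for PostgreSQL 9.6) looks roughly like this; a sketch for orientation, not the patch under review:

    -- pg_size_bytes() parses a human-readable size string into a byte count,
    -- roughly the inverse of pg_size_pretty().
    SELECT pg_size_bytes('1 MB');    -- 1048576
    SELECT pg_size_bytes('512 kB');  -- 524288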

Michael Paquier sent in another revision of a patch to ensure that dynloader.h be present in Windows installations.

Amul Sul sent in a patch to ensure that columns of tables made into inheritance children by ALTER TABLE ... INHERIT are actually dropped when they are dropped in the parents.

Vitaly Burovoy sent in another revision of a patch to allow non-crazy extraction of fields from 'infinity'::TIMESTAMP[TZ].

Alexander Shulgin sent in a patch to fix an inconsistency in error handling for the START_REPLICATION command.

Petr Korobeinikov sent in a patch to add schema-qualified relnames in constraint error messages.

Alexander Shulgin sent in a patch to add an \errverbose option to psql to include schema-qualified names in error messages.

Amit Kapila sent in two more revisions of a patch to refactor LWLock tranches.

Andreas Karlsson sent in another revision of a patch to add COPY (statement) tab completion to psql.

Amit Kapila sent in another revision of a patch to rename pgproc variables.

Andreas Karlsson sent in another revision of a patch to improve tab completion in psql for FDW DDL.

Kyotaro HORIGUCHI sent in another revision of a patch to prepare for sharing psqlscan with pgbench, change the access method to shell variables, detach common.c from psqlscan, ensure that pgbench uses a common frontend SQL parser, and change the way to hold command list.

Ashutosh Bapat sent in another revision of a patch to enable getting sorted data from foreign server for merge joins.

Ashutosh Bapat sent in a patch to remove duplicate parameter deparser code from postgres_fdw.

Craig Ringer sent in another revision of a patch to implement pg_logical_output.

Marisa Emerson sent in two revisions of a patch to add support for BSD authentication, used on OpenBSD.

Etsuro Fujita sent in a patch to fix an issue with updating foreign tables.

Peter Geoghegan sent in another revision of a patch to generalize SortSupport for text.

Konstantin Knizhnik sent in two revisions of a patch to optimize LIMIT clauses.

Craig Ringer sent in another revision of a patch to implement pglogical.

David Rowley sent in another revision of a patch to implement combining aggregates.

Ashutosh Bapat sent in a patch to fix some infelicities in FDW join pushdown and scanclauses.

Álvaro Herrera sent in another revision of a patch to allow cataloging NOT NULL constraints.

Tomas Vondra sent in another revision of a patch to extend the HyperLogLog API.

Stas Kelvich and Simon Riggs traded patches to speed up two-phase transactions.

Peter Geoghegan sent in a patch to fix some misspellings in nodeGather.c and controldata.c.

Michael Paquier sent in another revision of a patch to fix some issues with Cygwin.

Peter Eisentraut sent in a patch to fix some odd bugs in psql's tab completion of CREATE INDEX.

Christian Ullrich sent in another revision of a patch to fix an SSPI auth bug.

Christian Ullrich sent in a patch to close a handle leak in SSPI auth.

by N Bougain on Monday, January 11, 2016 at 22:51

Friday, January 8, 2016

Sébastien Lardière

PostgreSQL Version 9.5

A new major version of PostgreSQL has been available since January 7. Each PostgreSQL release adds its own set of features, for developers and administrators alike. This release brings many features aimed at improving performance when querying large volumes of data.

This overview, spread over three posts, introduces three kinds of features.

This first post therefore covers the SQL language additions in PostgreSQL 9.5. These additions touch many functional areas, from security to performance management to transaction handling. UPSERT: this keyword actually refers to the ability to intercept a primary-key error on a statement... Read the full post: Version 9.5 de PostgreSQL

by Sébastien Lardière on Friday, January 8, 2016 at 10:49

Thursday, January 7, 2016

Actualités PostgreSQL.fr

PostgreSQL 9.5: UPSERT, Row Level Security and Big Data

January 7, 2016: the PostgreSQL Global Development Group announces the release of PostgreSQL 9.5. This release adds UPSERT capability, Row Level Security, and many Big Data oriented features, which open up PostgreSQL's possible uses even further. With these new features, PostgreSQL will become the default choice for an even larger number of applications, whether for startups, large corporations or government agencies.

Annie Prévot, CIO of the CNAF (the French family allowance fund), says: "The CNAF serves 11 million beneficiaries and pays out 73 billion euros per year across 26 types of benefits. This service, essential to the population, relies on an information system that must be absolutely efficient and reliable. The CNAF's information system is based, to our great satisfaction, on the PostgreSQL database management system. In use on half of our systems, this open source software offers every guarantee of robustness and reliability. It is currently being deployed across all of our systems."

UPSERT

UPSERT, shorthand for "INSERT, ON CONFLICT UPDATE", is a feature application developers have been requesting for a long time. It allows new rows and updated rows to be handled in the same way. UPSERT simplifies web and mobile application development by letting the database handle conflicts between concurrent data changes. The feature also removes the last barrier to migrating legacy MySQL applications to PostgreSQL.

Developed over the past two years by Peter Geoghegan, a programmer at Heroku, PostgreSQL's implementation of UPSERT is significantly more flexible and powerful than those of most other relational databases. The new ON CONFLICT clause makes it possible to ignore the new data, or to update different columns or relations, in a way that supports even complex ETL (Extract, Transform, Load) chains. And, as expected from PostgreSQL, the feature was designed to be fully concurrency-safe and to integrate with PostgreSQL's other features, including logical replication.
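
As an illustration of the new clause, here is a minimal sketch (table and column names are invented for the example):

    CREATE TABLE counters (name text PRIMARY KEY, hits bigint NOT NULL DEFAULT 0);

    -- Insert a row, or bump the counter if the key already exists.
    INSERT INTO counters (name, hits)
    VALUES ('home_page', 1)
    ON CONFLICT (name)
    DO UPDATE SET hits = counters.hits + EXCLUDED.hits;

    -- Or simply skip rows that already exist:
    INSERT INTO counters (name) VALUES ('home_page')
    ON CONFLICT (name) DO NOTHING;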

Row Level Security

PostgreSQL continues to extend its access-control capabilities with the new Row Level Security (RLS) feature. RLS provides true per-row and per-column access control, and integrates with external security tools such as SELinux. PostgreSQL is already known as "the most secure database by default"; RLS cements its position as the best default choice for applications with strong security requirements, such as PCI compliance, the European Data Protection Directive, and healthcare data-protection standards.

RLS is the culmination of five years of security feature development in PostgreSQL, including work by KaiGai Kohei of NEC, Stephen Frost of Crunchy Data, and Dean Rasheed. With it, database administrators can define security policies that filter which rows are visible to particular users. The security put in place this way is resistant to SQL injection and other application-level security breaches.
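
A minimal sketch of a row-level security policy (table, column and policy names are invented for the example):

    CREATE TABLE documents (id serial PRIMARY KEY, owner text NOT NULL, body text);

    ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

    -- Each user only sees (and can only modify) the rows they own.
    CREATE POLICY documents_owner ON documents
        USING (owner = current_user);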

Big Data Features

PostgreSQL 9.5 includes many new features for larger databases and for integration with Big Data systems. These features ensure that PostgreSQL will continue to play a major role in the growing open source Big Data market. They include:

BRIN indexes: this new index type allows the creation of much smaller, yet very efficient, indexes for large tables, provided the data is naturally ordered. For example, tables containing millions of rows of log entries can be indexed and queried in 5% of the time required by a standard BTree index.
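
For instance, a sketch with an invented append-only log table:

    -- A BRIN index on the timestamp column of an append-only log table stays tiny,
    -- because it only stores summary information per range of table blocks.
    CREATE TABLE events (logged_at timestamptz NOT NULL, payload text);
    CREATE INDEX events_logged_at_brin ON events USING brin (logged_at);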

Faster sorts: PostgreSQL now sorts text and numeric data faster, using an algorithm known as "abbreviated keys". This speeds up some queries that need to sort large volumes of data by a factor of 2 to 12, and can also speed up index creation by a factor of 20.

CUBE, ROLLUP and GROUPING SETS: these new standard SQL clauses make it possible to produce reports with several levels of summarization in a single query. CUBE also allows tighter integration of PostgreSQL with a larger number of OLAP (Online Analytic Processing) reporting tools, such as Tableau.
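
A short sketch of the new grouping clauses (the sales table and its columns are invented for the example):

    -- One query produces per-(region, product) totals, per-region subtotals and a grand total.
    SELECT region, product, sum(amount) AS total
    FROM sales
    GROUP BY ROLLUP (region, product);

    -- GROUPING SETS spells out the desired grouping combinations explicitly:
    SELECT region, product, sum(amount) AS total
    FROM sales
    GROUP BY GROUPING SETS ((region, product), (region), ());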

Foreign Data Wrappers (FDW): these already make it possible to use PostgreSQL to query Big Data systems such as Hadoop or Cassandra. Version 9.5 adds IMPORT FOREIGN SCHEMA and join pushdown to the foreign source, which makes connections to external data more efficient and easier to set up.
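
For example, with postgres_fdw (the server, schema and table names are placeholders):

    -- Pull in foreign table definitions for a whole remote schema in one statement,
    -- instead of declaring each foreign table by hand.
    IMPORT FOREIGN SCHEMA public
        LIMIT TO (orders, customers)
        FROM SERVER remote_server INTO local_schema;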

TABLESAMPLE: this SQL clause makes it possible to quickly retrieve a statistical sample of a table without an expensive sort.
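
For instance (the table name is a placeholder; the percentage is arbitrary):

    -- Read roughly 1% of the table's pages instead of scanning the whole table.
    SELECT count(*) FROM big_table TABLESAMPLE SYSTEM (1);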

"The new BRIN index introduced in PostgreSQL 9.5 is a powerful feature that lets PostgreSQL manage and index volumes of data that were difficult, or even impossible, to handle in the past. It allows data and performance to scale beyond the limits previously associated with traditional relational databases, and makes PostgreSQL a perfect solution for Big Data analytics," says Boyan Botev, Lead Database Administrator, Premier, Inc.

Links

by daamien on Thursday, January 7, 2016 at 17:12

Wednesday, January 6, 2016

Actualités PostgreSQL.fr

PG Day France 2016 in Lille: Call for Speakers

PGDay France is the annual conference of the French-speaking PostgreSQL community. This year, the event will take place on May 31 in Lille. About a hundred attendees are expected for a day of exchanges around PostgreSQL and its related projects. More information is available on the event website: http://www.pgday.fr

Are you an expert in a field related to open source databases? Have you used PostgreSQL in a specific context (large volumes, heavy load, well-known customer, innovative project, etc.)? Do you take part in an open source project related to PostgreSQL? Then don't hesitate to submit a talk!

For the 2016 edition, the following topics are particularly highlighted:

  • Administration of large databases
  • Case studies / testimonials
  • Industrialization
  • Data warehouses
  • Decision-support systems
  • Work on semantics
  • Big Data
  • Data Mining / Data Exploration
  • Geographic Information Systems

This list is not exhaustive; other PostgreSQL-related topics may be proposed.

The PGDay France conference is aimed at professionals, in particular IT directors, decision makers, project managers, database administrators, developers, system administrators and anyone who works with a DBMS.

To submit a talk, simply send an e-mail to contact@pgday.fr with the following information:

  • your first and last name
  • your company/employer
  • a short bio (300 characters max.)
  • your Twitter account (optional)
  • the title of your talk
  • the length of your talk (45 min. max.)
  • a short description (200 characters max.)
  • a long description (700 characters max.)
  • a photo (at least 200x200 pixels)

Talks must be in French and made available under a free license. Talks may be recorded (audio/video) and broadcast on the Internet.

The deadline for submissions is February 14, 2016 at 23:59 CEST.

During February 2016, a community poll will be organized within the French-speaking community to evaluate the various proposals.

The selection committee will then review all valid proposals. Sessions will be chosen based on the quality of the submission, its interest for a professional audience, the coherence of the day's programme and the result of the preliminary vote.

The selection committee's decision is final and cannot be appealed.

Committee members speak in their own name; their choices do not reflect their employer's position.

Selected speakers will be notified by e-mail before March 7, 2016.

For any question about this call for speakers or about PGDay France in general, you can send a message to: contact@pgday.fr

by daamien on Wednesday, January 6, 2016 at 18:03

Tuesday, January 5, 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - January 3, 2016

PostgreSQL Product News

PostgreSQL Jobs for January

PostgreSQL Local

  • PostgreSQL@SCaLE is a two-day, two-track event which will take place January 21-22, 2016, at the Pasadena Convention Center as part of SCaLE 14X. https://www.socallinuxexpo.org/scale/14x/cfp
  • FOSDEM PGDay is a one-day conference that will be held ahead of FOSDEM in Brussels, Belgium, on January 29, 2016. Details and call for papers: http://fosdem2016.pgconf.eu/
  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference to be held February 17-18, 2016 in Prague, Czech Republic. Czech-language site: http://www.p2d2.cz/
  • The annual Indian PGDay will be held in Bangalore (Karnataka, India) on February 26, 2016. The call for papers is open: http://pgday.in
  • The first pan-Asian PostgreSQL conference will take place March 16-17, 2016 in Singapore. The call for papers is open: http://2016.pgday.asia/
  • Nordic PGDay, a single-day series of talks, will take place in Helsinki, Finland, on March 17, 2016. The call for papers is open: http://2016.nordicpgday.org/
  • PGConf US 2016 will take place April 18-20 in New York City. The call for papers closes January 31, 2016 at 23:59 EST: http://www.pgconf.us/2016/
  • PGCon 2016 will be held May 17-21, 2016 in Ottawa. The call for papers is open: http://www.pgcon.org/

PostgreSQL in the News

PostgreSQL Weekly News is brought to you this week by David Fetter. Translation by the PostgreSQLFr team, under a CC BY-NC-SA license. The original version can be found at: http://www.postgresql.org/message-id/20160104004346.GH30007@fetter.org

Submit news and announcements by Sunday at 3:00 pm Pacific time. Please send English language ones to david (a) fetter.org, German language to pwn (a) pgug.de, Italian language to pwn (a) itpug.org and Spanish language to pwn (a) arpug.com.ar.

Applied Patches

Álvaro Herrera pushed:

Tom Lane pushed:

  • Update documentation about pseudo-types. Tone down an overly strong statement about which pseudo-types PLs are likely to allow. Add "event_trigger" to the list, as well as "pg_ddl_command" in 9.5/HEAD. Back-patch to 9.3 where event_trigger was added. http://git.postgresql.org/pg/commitdiff/731dfc7d5f07fac2c3c72f47c29a947e363acee9
  • Fix omission of -X (--no-psqlrc) in some psql invocations. As of commit d5563d7df, psql -c no longer implies -X, but not all of our regression testing scripts had gotten that memo. To ensure consistency of results across different developers, make sure that *all* invocations of psql in all scripts in our tree use -X, even where this is not what previously happened. Michael Paquier and Tom Lane http://git.postgresql.org/pg/commitdiff/870df2b3b77414a536d6533566628f11f8f309ec
  • Document the exponentiation operator as associating left to right. Common mathematical convention is that exponentiation associates right to left. We aren't going to change the parser for this, but we could note it in the operator's description. (It's already noted in the operator precedence/associativity table, but users might not look there.) Per bug #13829 from Henrik Pauli. (A short example follows this list.) http://git.postgresql.org/pg/commitdiff/54aaafe95f65c95fd9ba085826af87d778c94613
  • Code and docs review for cube kNN support. Commit 33bd250f6c4cc309f4eeb657da80f1e7743b3e5c could have done with some more review: Adjust coding so that compilers unfamiliar with elog/ereport don't complain about uninitialized values. Fix misuse of PG_GETARG_INT16 to retrieve arguments declared as "integer" at the SQL level. (This was evidently copied from cube_ll_coord and cube_ur_coord, but those were wrong too.) Fix non-style-guide-conforming error messages. Fix underparenthesized if statements, which pgindent would have made a hash of, and remove some unnecessary parens elsewhere. Run pgindent over new code. Revise documentation: repeated accretion of more operators without any rethinking of the text already there had left things in a bit of a mess. Merge all the cube operators into one table and adjust surrounding text appropriately. David Rowley and Tom Lane http://git.postgresql.org/pg/commitdiff/81ee726d87ec67c4f2846110c99f72e8a20dcd07
  • Put back one copyObject() in rewriteTargetView(). Commit 6f8cb1e23485bd6d tried to centralize rewriteTargetView's copying of a target view's Query struct. However, it ignored the fact that the jointree->quals field was used twice. This only accidentally failed to fail immediately because the same ChangeVarNodes mutation is applied in both cases, so that we end up with logically identical expression trees for both uses (and, as the code stands, the second ChangeVarNodes call actually does nothing). However, we end up linking *physically* identical expression trees into both an RTE's securityQuals list and the WithCheckOption list. That's pretty dangerous, mainly because prepsecurity.c is utterly cavalier about further munging such structures without copying them first. There may be no live bug in HEAD as a consequence of the fact that we apply preprocess_expression in between here and prepsecurity.c, and that will make a copy of the tree anyway. Or it may just be that the regression tests happen to not trip over it. (I noticed this only because things fell over pretty badly when I tried to relocate the planner's call of expand_security_quals to before expression preprocessing.) In any case it's very fragile because if anyone tried to make the securityQuals and WithCheckOption trees diverge before we reach preprocess_expression, it would not work. The fact that the current code will preprocess securityQuals and WithCheckOptions lists at completely different times in different query levels does nothing to increase my trust that that can't happen. In view of the fact that 9.5.0 is almost upon us and the aforesaid commit has seen exactly zero field testing, the prudent course is to make an extra copy of the quals so that the behavior is not different from what has been in the field during beta. http://git.postgresql.org/pg/commitdiff/fd1952575618cacf7afa544d8b89ddb77be9eaee
  • Add some comments about division of labor between rewriter and planner. The rationale for the way targetlist processing is done wasn't clearly stated anywhere, and I for one had forgotten some of the details. Having just painfully re-learned them, add some breadcrumbs for the next person. http://git.postgresql.org/pg/commitdiff/efe4c9d7049f0bf832b792bfad05c92aaf86aa3c
  • Minor hacking on contrib/cube documentation. Improve markup, particularly of the table of functions; add or improve examples for some of the functions; wordsmith some of the function descriptions. http://git.postgresql.org/pg/commitdiff/e5e5267a91f4880c121bf50865cbc38078441989
  • Avoid useless truncation attempts during VACUUM. VACUUM can skip heap pages altogether when there's a run of consecutive pages that are all-visible according to the visibility map. This causes it to not update its nonempty_pages count, just as if those pages were empty, which means that at the end we will think they are candidates for deletion. Thus, we may take the table's AccessExclusive lock only to find that no pages are really truncatable. This usually causes no real problems on a master server, thanks to the lock being acquired only conditionally; but on hot-standby servers, the same lock must be acquired unconditionally which can result in unnecessary query cancellations. To improve matters, force examination of the table's last page whenever we reach there with a nonempty_pages count that would allow a truncation attempt. If it's not empty, we'll advance nonempty_pages and thereby prevent the truncation attempt. If we are unable to acquire cleanup lock on that page, there's no need to force it, unless we're doing an anti-wraparound vacuum. We can just check for tuples with a shared buffer lock and then give up. (When we are doing an anti-wraparound vacuum, and decide it's okay to skip the page because it contains no freezable tuples, this patch still improves matters because nonempty_pages is properly updated, which it was not before.) Since only the last page is special-cased in this way, we might attempt a truncation that will release many fewer pages than the normal heuristic would suggest; at worst, only one page would be truncated. But that seems all right, because the situation won't repeat during the next vacuum. The real problem with the old logic is that the useless truncation attempt happens every time we vacuum, so long as the state of the last few dozen pages doesn't change. This is a longstanding deficiency, but since the consequences aren't very severe in most scenarios, I'm not going to risk a back-patch. Jeff Janes and Tom Lane http://git.postgresql.org/pg/commitdiff/e842908233bb9c5cea0e765fc828b52badd8228e
  • Dept of second thoughts: the !scan_all exit mustn't increase scanned_pages. In the extreme edge case where contended pages are the only ones that escape being scanned, the previous commit would have allowed us to think that relfrozenxid could be advanced, which is exactly wrong. http://git.postgresql.org/pg/commitdiff/e5d06f2b12a7c75f2b0c7fd2055a14efaa2b59ec
  • Fix ALTER OPERATOR to update dependencies properly. Fix an oversight in commit 321eed5f0f7563a0: replacing an operator's selectivity functions needs to result in a corresponding update in pg_depend. We have a function that can handle that, but it was not called by AlterOperator(). To fix this without enlarging pg_operator.h's #include list beyond what clients can safely include, split off the function definitions into a new file pg_operator_fn.h, similarly to what we've done for some other catalog header files. It's not entirely clear whether any client-side code needs to include pg_operator.h, but it seems prudent to assume that there is some such code somewhere. http://git.postgresql.org/pg/commitdiff/0dab5ef39b3d9d86e45bbbb2f6ea60b4f5517d9a
  • Add a comment noting that FDWs don't have to implement EXCEPT or LIMIT TO. postgresImportForeignSchema pays attention to IMPORT's EXCEPT and LIMIT TO options, but only as an efficiency hack, not for correctness' sake. The FDW documentation does explain that, but someone using postgres_fdw.c as a coding guide might not remember it, so let's add a comment here. Per question from Regina Obe. http://git.postgresql.org/pg/commitdiff/5f36096b77fe47015cbac130d1a20d089f202a1e
  • Add some more defenses against silly estimates to gincostestimate(). A report from Andy Colson showed that gincostestimate() was not being nearly paranoid enough about whether to believe the statistics it finds in the index metapage. The problem is that the metapage stats (other than the pending-pages count) are only updated by VACUUM, and in the worst case could still reflect the index's original empty state even when it has grown to many entries. We attempted to deal with that by scaling up the stats to match the current index size, but if nEntries is zero then scaling it up still gives zero. Moreover, the proportion of pages that are entry pages vs. data pages vs. pending pages is unlikely to be estimated very well by scaling if the index is now orders of magnitude larger than before. We can improve matters by expanding the use of the rule-of-thumb estimates I introduced in commit 7fb008c5ee59b040: if the index has grown by more than a cutoff amount (here set at 4X growth) since VACUUM, then use the rule-of-thumb numbers instead of scaling. This might not be exactly right but it seems much less likely to produce insane estimates. I also improved both the scaling estimate and the rule-of-thumb estimate to account for numPendingPages, since it's reasonable to expect that that is accurate in any case, and certainly pages that are in the pending list are not either entry or data pages. As a somewhat separate issue, adjust the estimation equations that are concerned with extra fetches for partial-match searches. These equations suppose that a fraction partialEntries / numEntries of the entry and data pages will be visited as a consequence of a partial-match search. Now, it's physically impossible for that fraction to exceed one, but our estimate of partialEntries is mostly bunk, and our estimate of numEntries isn't exactly gospel either, so we could arrive at a silly value. In the example presented by Andy we were coming out with a value of 100, leading to insane cost estimates. Clamp the fraction to one to avoid that. Like the previous patch, back-patch to all supported branches; this problem can be demonstrated in one form or another in all of them. http://git.postgresql.org/pg/commitdiff/3c93a60f6019768f5742b7893a93db93fb67e71f
  • Teach flatten_reloptions() to quote option values safely. flatten_reloptions() supposed that it didn't really need to do anything beyond inserting commas between reloption array elements. However, in principle the value of a reloption could be nearly anything, since the grammar allows a quoted string there. Any restrictions on it would come from validity checking appropriate to the particular option, if any. A reloption value that isn't a simple identifier or number could thus lead to dump/reload failures due to syntax errors in CREATE statements issued by pg_dump. We've gotten away with not worrying about this so far with the core-supported reloptions, but extensions might allow reloption values that cause trouble, as in bug #13840 from Kouhei Sutou. To fix, split the reloption array elements explicitly, and then convert any value that doesn't look like a safe identifier to a string literal. (The details of the quoting rule could be debated, but this way is safe and requires little code.) While we're at it, also quote reloption names if they're not safe identifiers; that may not be a likely problem in the field, but we might as well try to be bulletproof here. It's been like this for a long time, so back-patch to all supported branches. Kouhei Sutou, adjusted some by me http://git.postgresql.org/pg/commitdiff/c7e27becd2e6eb93b20965b9f22701eaad42a764
  • Update copyright for 2016 Manually fix some copyright lines missed by the automated script. http://git.postgresql.org/pg/commitdiff/ad08bf5c8b96c2a3a70d96f5be1c04cb83b4ed6e
  • Update copyright for 2016 On closer inspection, the reason copyright.pl was missing files is that it is looking for 'Copyright (c)' and they had 'Copyright (C)'. Fix that, and update a couple more that grepping for that revealed. http://git.postgresql.org/pg/commitdiff/48c9f2889a4ad25a771d13b88f2778a306f2d970
  • Make copyright.pl cope with nonstandard case choices in copyright notices. The need for this is shown by the files it missed in Bruce's recent run. I fixed it so that it will actually adjust the case when needed. In passing, also make it skip .po files, since those will just get overwritten anyway from the translation repository. http://git.postgresql.org/pg/commitdiff/de7c8dbea1a17a0e1709c4b12371615d28e21c13
  • Adjust back-branch release note description of commits a2a718b22 et al. As pointed out by Michael Paquier, recovery_min_apply_delay didn't exist in 9.0-9.3, making the release note text not very useful. Instead make it talk about recovery_target_xid, which did exist then. 9.0 is already out of support, but we can fix the text in the newer branches' copies of its release notes. http://git.postgresql.org/pg/commitdiff/df35af2ca7b5545d32b978a88b665bac2b9fa638
  • Fix overly-strict assertions in spgtextproc.c. spg_text_inner_consistent is capable of reconstructing an empty string to pass down to the next index level; this happens if we have an empty string coming in, no prefix, and a dummy node label. (In practice, what is needed to trigger that is insertion of a whole bunch of empty-string values.) Then, we will arrive at the next level with in->level == 0 and a non-NULL (but zero length) in->reconstructedValue, which is valid but the Assert tests weren't expecting it. Per report from Andreas Seltenreich. This has no impact in non-Assert builds, so should not be a problem in production, but back-patch to all affected branches anyway. In passing, remove a couple of useless variable initializations and shorten the code by not duplicating DatumGetPointer() calls. http://git.postgresql.org/pg/commitdiff/7157fe80f42476db249e062b4f6eef6a3994b234
  • Teach pg_dump to quote reloption values safely. Commit c7e27becd2e6eb93 fixed this on the backend side, but we neglected the fact that several code paths in pg_dump were printing reloptions values that had not gotten massaged by ruleutils. Apply essentially the same quoting logic in those places, too. http://git.postgresql.org/pg/commitdiff/b416c0bb622ce5d33fdbec3bbce00451132f10ec
  • Fix treatment of *lpNumberOfBytesRecvd == 0: that's a completion condition. pgwin32_recv() has treated a non-error return of zero bytes from WSARecv() as being a reason to block ever since the current implementation was introduced in commit a4c40f140d23cefb. However, so far as one can tell from Microsoft's documentation, that is just wrong: what it means is graceful connection closure (in stream protocols) or receipt of a zero-length message (in message protocols), and neither case should result in blocking here. The only reason the code worked at all was that control then fell into the retry loop, which did *not* treat zero bytes specially, so we'd get out after only wasting some cycles. But as of 9.5 we do not normally reach the retry loop and so the bug is exposed, as reported by Shay Rojansky and diagnosed by Andres Freund. Remove the unnecessary test on the byte count, and rearrange the code in the retry loop so that it looks identical to the initial sequence. Back-patch to 9.5. The code is wrong all the way back, AFAICS, but since it's relatively harmless in earlier branches we'll leave it alone. http://git.postgresql.org/pg/commitdiff/90e61df8130dc7051a108ada1219fb0680cb3eb6
  • Do a final round of copy-editing on the 9.5 release notes. http://git.postgresql.org/pg/commitdiff/027989197aab9e555328721b003ebd1839a16704
  • Do some copy-editing on the docs for replication origins. Minor grammar and markup improvements. http://git.postgresql.org/pg/commitdiff/c6aeba353a15d71f584488a7482fb88337f843e3
  • Guard against null arguments in binary_upgrade_create_empty_extension(). The CHECK_IS_BINARY_UPGRADE macro is not sufficient security protection if we're going to dereference pass-by-reference arguments before it. But in any case we really need to explicitly check PG_ARGISNULL for all the arguments of a non-strict function, not only the ones we expect null values for. Oversight in commits 30982be4e5019684e1772dd9170aaa53f5a8e894 and f92fc4c95ddcc25978354a8248d3df22269201bc. Found by Andreas Seltenreich. (The other usages in pg_upgrade_support.c seem safe.) http://git.postgresql.org/pg/commitdiff/939d10cd8711fdeb7f0ff62c9c6b08e3eddbba3e
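
To illustrate the associativity note above, a quick example:

    -- PostgreSQL's ^ operator associates left to right, unlike usual mathematical convention.
    SELECT 2 ^ 3 ^ 2;     -- parsed as (2 ^ 3) ^ 2 = 64
    SELECT 2 ^ (3 ^ 2);   -- explicit parentheses yield 512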

Joe Conway pushed:

  • Rename (new|old)estCommitTs to (new|old)estCommitTsXid. The variables newestCommitTs and oldestCommitTs sound as if they are timestamps, but in fact they are the transaction Ids that correspond to the newest and oldest timestamps rather than the actual timestamps. Rename these variables to reflect that they are actually xids: to wit newestCommitTsXid and oldestCommitTsXid respectively. Also modify related code in a similar fashion, particularly the user facing output emitted by pg_controldata and pg_resetxlog. Complaint and patch by me, review by Tom Lane and Alvaro Herrera. Backpatch to 9.5 where these variables were first introduced. http://git.postgresql.org/pg/commitdiff/241448b23adf3432988f2b4012ff90a338b4d0bf

Peter Eisentraut pushed:

Noah Misch pushed:

  • Fix comments about WAL rule "write xlog before data" versus pg_multixact. Recovery does not achieve its goal of zeroing all pg_multixact entries whose accompanying WAL records never reached disk. Remove that claim and justify its expendability. Detail the need for TrimMultiXact(), which has little in common with the TrimCLOG() rationale. Merge two tightly-related comments. Stop presenting pg_multixact as specific to heap_lock_tuple(); PostgreSQL 9.3 extended its use to heap_update(). Noticed while investigating a report from Andres Freund. http://git.postgresql.org/pg/commitdiff/3cd1ba147e5619199914e5b71e0edbd188a763d2
  • Cover heap_page_prune_opt()'s cleanup lock tactic in README. Jeff Janes, reviewed by Jim Nasby. http://git.postgresql.org/pg/commitdiff/dfcd9cb30237f882b7308bdcbfb0318b22b1e224

Bruce Momjian pushed:

Rejected patches (to date)

No one was disappointed this week :-)

Pending patches

Peter Geoghegan sent in another revision of a patch to use quicksort when performing external sorts.

Jeff Janes sent in another revision of a patch to add a gin_clean_pending_list() function.

David Rowley sent in two more revisions of a patch to fix some typos.

Grzegorz Sampolski sent in a patch to add rhost as an option for PAM auth.

Haribabu Kommi and Alexander Shulgin traded patches to add a pg_hba_lookup() function.

Dmitry Ivanov sent in a patch to add an array_elemtype() function.

Aleksander Alekseev sent in three more revisions of a patch to fix lock contention for HASHHDR.mutex.

Andreas Karlsson sent in a patch to implement tab completion in psql for COPY with a query.

Joe Conway sent in another revision of a patch to expose pg_controldata and pg_config as functions.

Haribabu Kommi sent in another revision of a patch to implement multi-tenancy with RLS.

Thomas Munro sent in another revision of a patch to implement causal reads.

Tom Lane sent in a patch to better detail logging for password auth failures.

Michael Paquier sent in three revisions of a patch to add support for a case-sensitive comparison facility to psql's tab completion.

Bruce Momjian sent in a patch to ensure that the correct .h files are installed on Windows.

Amit Kapila sent in two more revisions of a patch to refactor LWLock tranches.

Andreas Karlsson sent in a patch to improve psql's tab completion for FDW DDL.

Stas Kelvich sent in another revision of a patch to add Tsvector editing functions.

Pavel Stěhule sent in another revision of a patch to add a custom function for converting human-readable sizes to bytes.

Álvaro Herrera sent in another revision of a patch to implement column stores.

Corey Huinker sent in another revision of a patch to allow FETCH to be limited by a specification measured in bytes rather than rows.

Tom Lane sent in a patch to improve the rows estimate for BRIN indexes.

Pavel Stěhule sent in another revision of a patch to make a PL/PythonU version of ereport().

Dilip Kumar sent in another revision of a patch to improve the scalability of relation extension.

David Steele sent in a patch to add a pg_audit extension.

Tomas Vondra sent in another revision of a patch to postpone building buckets to the end of Hash (in HashJoin).

Tomas Vondra sent in another revision of a patch to add a Bloom filter in Hash Joins with batches.

Tomas Vondra sent in a patch to extend the hyperloglog API by adding initHyperLogLogError() and freeHyperLogLog().

Petr Jelínek sent in a patch to add pglogical, a logical replication contrib module.

Petr Jelínek sent in another revision of a patch to add a sequence access method.

Petr Jelínek sent in a patch to fix a copy-paste error in the logical decoding docs.

Petr Jelínek sent in a patch to implement generic WAL logical messages.

Simon Riggs sent in a WIP patch to implement failover slots.

Michael Paquier sent in another revision of a patch to fix how pg_dump locks tables.

Jim Nasby sent in two revisions of a patch to improve error reporting in format().

Amit Kapila and Andres Freund traded patches to fix an issue with backend processes not terminating properly.

SAWADA Masahiko sent in another revision of a patch to support N synchronous standby servers for N>1.

Simon Riggs sent in another revision of a patch to avoid standby pin scans.

Andreas Karlsson sent in two approaches to a patch to fix some infelicities between psql's \x auto feature and EXPLAIN.

Pavel Stěhule sent in another revision of a patch to implement a num_nulls() function.

by N Bougain on Tuesday 5 January 2016 at 21:45

Friday 1 January 2016

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 27 December 2015

News about PostgreSQL-derived products

PostgreSQL-related job offers for December

PostgreSQL Local

  • PostgreSQL@SCaLE is a two-day, two-track event that will take place on 21 and 22 January 2016 at the Pasadena Convention Center, as part of SCaLE 14X. https://www.socallinuxexpo.org/scale/14x/cfp
  • FOSDEM PGDay is a one-day conference that will be held before FOSDEM in Brussels, Belgium, on 29 January 2016. Details and call for papers: http://fosdem2016.pgconf.eu/
  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference held on 17 and 18 February 2016 in Prague, Czech Republic. Czech-language site: http://www.p2d2.cz/
  • The annual Indian PGDay will be held in Bangalore (Karnataka, India) on 26 February 2016. The call for speakers is open: http://pgday.in
  • The first pan-Asian PostgreSQL conference will be held on 16 and 17 March 2016 in Singapore. The call for papers is open: http://2016.pgday.asia/
  • Nordic PGDay, a one-day series of talks, will take place in Helsinki, Finland, on 17 March 2016. The call for speakers is open: http://2016.nordicpgday.org/
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York City. The call for papers closes on 31 January 2016 at 23:59 EST: http://www.pgconf.us/2016/
  • PGCon 2016 will be held from 17 to 21 May 2016 in Ottawa. The call for papers has been launched: http://www.pgcon.org/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translated by the PostgreSQLFr team under a CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20151228022000.GA6856@fetter.org

Submit your articles or announcements before Sunday 15:00 Pacific time. Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Stephen Frost pushed:

  • Make viewquery a copy in rewriteTargetView() Rather than expect the Query returned by get_view_query() to be read-only and then copy bits and pieces of it out, simply copy the entire structure when we get it. This addresses an issue where AcquireRewriteLocks, which is called by acquireLocksOnSubLinks(), scribbles on the parsetree passed in, which was actually an entry in relcache, leading to segfaults with certain view definitions. This also future-proofs us a bit for anyone adding more code to this path. The acquireLocksOnSubLinks() was added in commit c3e0ddd40. Back-patch to 9.3 as that commit was. http://git.postgresql.org/pg/commitdiff/6f8cb1e23485bd6d45e8865760436e1a5ce65a6d

Tom Lane pushed:

  • Fix calculation of space needed for parsed words in tab completion. Yesterday in commit d854118c8, I had a serious brain fade leading me to underestimate the number of words that the tab-completion logic could divide a line into. On input such as "(((((", each character will get seen as a separate word, which means we do indeed sometimes need more space for the words than for the original line. Fix that. http://git.postgresql.org/pg/commitdiff/f5a4370aea3580f5f7f59a77e41fde62f2be12d8
  • Allow omitting one or both boundaries in an array slice specifier. Omitted boundaries represent the upper or lower limit of the corresponding array subscript. This allows simpler specification of many common use-cases. (Revised version of commit 9246af6799819847faa33baf441251003acbb8fe) YUriy Zhuravlev http://git.postgresql.org/pg/commitdiff/6efbded6e4672c597a6f0dc0f09263e7db7369ff
  • In pg_dump, remember connection passwords no matter how we got them. When pg_dump prompts the user for a password, it remembers the password for possible re-use by parallel worker processes. However, libpq might have extracted the password from a connection string originally passed as "dbname". Since we don't record the original form of dbname but break it down to host/port/etc, the password gets lost. Fix that by retrieving the actual password from the PGconn. (It strikes me that this whole approach is rather broken, as it will also lose other information such as options that might have been present in the connection string. But we'll leave that problem for another day.) In passing, get rid of rather silly use of malloc() for small fixed-size arrays. Back-patch to 9.3 where parallel pg_dump was introduced. Report and fix by Zeus Kronion, adjusted a bit by Michael Paquier and me http://git.postgresql.org/pg/commitdiff/1aa41e3eae3746e05d0e23286ac740a9a6cee7df
  • Improve handling of password reuse in src/bin/scripts programs. This reverts most of commit 83dec5a71 in favor of having connectDatabase() store the possibly-reusable password in a static variable, similar to the coding we've had for a long time in pg_dump's version of that function. To avoid possible problems with unwanted password reuse, make callers specify whether it's reasonable to attempt to re-use the password. This is a wash for cases where re-use isn't needed, but it is far simpler for callers that do want that. Functionally there should be no difference. Even though we're past RC1, it seems like a good idea to back-patch this into 9.5, like the prior commit. Otherwise, if there are any third-party users of connectDatabase(), they'll have to deal with an API change in 9.5 and then another one in 9.6. Michael Paquier http://git.postgresql.org/pg/commitdiff/ff402ae11b4d33e0e46c2730f63033d3631b8010
  • Avoid VACUUM FULL altogether in initdb. Commit ed7b3b3811c5836a purported to remove initdb's use of VACUUM FULL, as had been agreed to in a pghackers discussion back in Dec 2014. But it missed this one ... http://git.postgresql.org/pg/commitdiff/01e386a325549b7755739f31308de4be8eea110d
  • Fix factual and grammatical errors in comments for struct _tableInfo. Amit Langote, further adjusted by me http://git.postgresql.org/pg/commitdiff/96cd61a16958d3a64da697c3ef31eee5e10141a0
  • Docs typo fix. Michael Paquier http://git.postgresql.org/pg/commitdiff/bee172fcd586bccd3a3ba067592d639b7600aa04
  • Docs: fix erroneously-given function name. pg_replication_session_is_setup() exists nowhere; apparently this is meant to refer to pg_replication_origin_session_is_setup(). Adrien Nayrat http://git.postgresql.org/pg/commitdiff/71dd092c0177af14a00bbb18a8aebbed0d389f05
  • Remove unnecessary row ordering dependency in pg_rewind test suite. t/002_databases.pl was expecting to see a specific physical order of the rows in pg_database. I broke that in HEAD with commit 01e386a325549b77, but I'd say it's a pretty fragile test methodology in any case, so fix it in 9.5 as well. http://git.postgresql.org/pg/commitdiff/a9246fbf665327870370d1088bfc9efdfd2719ec
  • Fix brin_summarize_new_values() to check index type and ownership. brin_summarize_new_values() did not check that the passed OID was for an index at all, much less that it was a BRIN index, and would fail in obscure ways if it wasn't (possibly damaging data first?). It also lacked any permissions test; by analogy to VACUUM, we should only allow the table's owner to summarize. Noted by Jeff Janes, fix by Michael Paquier and me http://git.postgresql.org/pg/commitdiff/3d2b31e30e2931b3edb5ab9d0eafca13e7bcffe5
  • Include typmod when complaining about inherited column type mismatches. MergeAttributes() rejects cases where columns to be merged have the same type but different typmod, which is correct; but the error message it printed didn't show either typmod, which is unhelpful. Changing this requires using format_type_with_typemod() in place of TypeNameToString(), which will have some minor side effects on the way some type names are printed, but on balance this is an improvement: the old code sometimes printed one type according to one set of rules and the other type according to the other set, which could be confusing in its own way. Oddly, there were no regression test cases covering any of this behavior, so add some. Complaint and fix by Amit Langote http://git.postgresql.org/pg/commitdiff/fec1ad94dfc5ddacfda8d249bf4b3c739da8f7a1

Robert Haas pushed:

  • postgres_fdw: Consider requesting sorted data so we can do a merge join. When use_remote_estimate is enabled, consider adding ORDER BY to the query we send to the remote server so that we can use that ordered data for a merge join. Commit f18c944b6137329ac4a6b2dce5745c5dc21a8578 arranges to push down the query pathkeys, which seems like the case most likely to be a win, but testing shows this can sometimes win, too. For a regular table, we know which indexes are present and therefore test whether the ordering provided by each such index is useful. Here, we take the opposite approach: guess what orderings would be useful if they could be generated cheaply, and then ask the remote side what those will cost. Ashutosh Bapat, with very substantial cosmetic revisions by me. Also reviewed by Rushabh Lathia. http://git.postgresql.org/pg/commitdiff/ccd8f97922944566d26c7d90eb67ab7848ee9905
  • Comment improvements for abbreviated keys. Peter Geoghegan and Robert Haas http://git.postgresql.org/pg/commitdiff/0ba3f3bc65f1176250b942e14fd9e4975a5d3913
  • Change Gather not to use a physical tlist. This should have been part of the original commit, but was missed. Pushing data between processes is expensive, so we definitely want to project away unneeded columns here, just as we do for other nodes like Sort and Hash that care about the volume of data. http://git.postgresql.org/pg/commitdiff/51d152f18e124cc07c293756cc16014ba218b2df
  • Read from the same worker repeatedly until it returns no tuple. The original coding read tuples from workers in round-robin fashion, but performance testing shows that it works much better to read enough to empty one queue before moving on to the next. I believe the reason for this is that, with the old approach, we could easily wake up a worker repeatedly to write only one new tuple into the shm_mq each time. With this approach, by the time the process gets scheduled, it has a decent chance of being able to fill the entire buffer in one go. Patch by me. Dilip Kumar helped with performance testing. http://git.postgresql.org/pg/commitdiff/bc7fcab5e36b9597857fa7e3fa6d9ba54aaea167

Peter Eisentraut pushed:

Teodor Sigaev pushed:

Fujii Masao pushed:

Álvaro Herrera pushed:

Rejected patches (to date)

No one was disappointed this week :-)

Pending patches

Michael Paquier sent in another revision of a patch to fix bug #13685.

Michael Paquier sent in another revision of a patch to add in-core regression tests for replication, cascading, archiving, PITR, etc.

Haribabu Kommi sent in another revision of a patch to implement parallel aggregation.

Amit Kapila sent in another revision of a patch to speed up clog access by increasing CLOG buffers.

Simon Riggs sent in a patch to avoid the "pin scan" on standbys.

Victor Wagner sent in another revision of a patch to implement failover on libpq connect level.

Artur Zakirov and Pavel Stěhule traded patches to fix %TYPE and add %ARRAYTYPE and %ELEMENTTYPE to PL/pgsql.

Stephen Frost sent in two more revisions of a patch to add a note regarding permissions in pg_catalog, reserve the "pg_" namespace for roles, and create default roles.

Etsuro Fujita sent in two more revisions of a patch to optimize writes to the PostgreSQL FDW.

Kyotaro HORIGUCHI sent in another revision of a patch to fix some issues in psql's tab completion for CREATE/DROP INDEX.

Pavel Stěhule sent in another revision of a patch to add a pg_size_bytes() function.

Noah Misch sent in another revision of a patch to speed up writing out stats.

Aleksander Alekseev sent in another revision of a patch to fix lock contention for HASHHDR.mutex.

YUriy Zhuravlev sent in another revision of a patch to extend the array slice syntax.

Tomas Vondra sent in another revision of a patch to add multivariate statistics.

Robert Haas sent in another revision of a patch to add parallel joins.

Haribabu Kommi sent in another revision of a patch to add a pg_hba_lookup function to get all matching pg_hba.conf entries.

Peter Geoghegan sent in a patch to test a maximum order for external sort runs.

Michael Paquier sent in a patch to fix a typo in the pg_rewind docs.

David Rowley sent in another revision of a patch to make it possible to combine aggregates. This is infrastructure for, among other things, making these easier to parallelize and shard.

Ildus Kurbangaliev sent in three more revisions of a patch to add GiST support for UUIDs.

Daniel Verité sent in another revision of a patch to add a \crosstabview command to psql.

Aleksander Alekseev sent in two more revisions of a patch to improve performance of tables with many partitions.

Ildus Kurbangaliev sent in another revision of a patch to refactor lwlock tranches.

Alexander Korotkov sent in another revision of a patch to rework the access method interface.

SAWADA Masahiko sent in a patch to allow sending multiple options to ALTER ROLE...(RE)SET.

Jeff Janes sent in another revision of a patch to eliminate spurious standby query cancellations.

Emre Hasegeli sent in another revision of a patch to add a BRIN correlation cost estimate.

SAWADA Masahiko sent in another revision of a patch to support N (N > 1) synchronous standby servers.

Vinayak Pokale sent in another revision of a patch to implement a VACUUM progress checker.

Teodor Sigaev sent in a patch to support OR clauses in index scans.

Joe Conway sent in another revision of a patch to expose pg_controldata and pg_config as functions.

Jeff Janes sent in a patch to avoid endless futile table locks in vacuuming.

Artur Zakirov sent in another revision of a patch to add fuzzy substring searching to the pg_trgm extension.

Peter Geoghegan sent in a patch to make some fixes and changes to the INSERT documentation.

by N Bougain on Friday 1 January 2016 at 18:12

Wednesday 30 December 2015

Nicolas Thauvin

Thoughts on archiving WAL files

As PostgreSQL keeps making its way into more installations, we now end up with configurations where WAL files must be archived several times over, because PITR backups and replication are used together.

The biggest trap when combining PITR with replication, or with several standby servers, is forgetting that every component of the architecture that consumes WAL must have its own directory of archived WAL files, because each one purges those files differently.

It is easy to fall into that trap by telling yourself, "no problem purging the old files, the PITR backup or the most distant slave will do it". If the PITR purge runs while the standby is disconnected from the master, that purge can break replication.

The solution is to archive the same WAL file several times. To make that efficient, the most effective approach in practice is to use hard links: archive the WAL file once, then create as many hard links as needed for the other consumers of archived WAL. Remember that the data is removed only when no link remains and no process still has the file open, which is not to be confused with a symbolic link.
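
A couple of shell commands make the hard-link behaviour concrete (the WAL segment name and directories below are just examples):

# one archived copy, plus one extra hard link per additional consumer
ln /archived_xlog/slave1/000000010000000000000042 /archived_xlog/slave2/000000010000000000000042
ln /archived_xlog/slave1/000000010000000000000042 /archived_xlog/pitr/000000010000000000000042
# removing one consumer's link does not delete the data; it is freed only
# once the last link is gone and no process still has the file open
rm /archived_xlog/slave1/000000010000000000000042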

To archive quickly, it is best to avoid compression and to store the archives either locally or on an NFS share, archiving over SSH being the slowest option. It is all a trade-off between encrypting traffic on the network and available disk space, hard links being fast to create and adding a negligible amount of extra disk usage.

Finally, PostgreSQL runs the archive command through a system() call, which forks a shell: every shell feature is therefore available, for example:

archive_command = 'rsync -a %p slave1:/archived_xlog/slave1/%f && ssh slave1 "for h in slave2 pitr; do ln /archived_xlog/slave1/%f /archived_xlog/$h/%f; done"'

Yes: one loop and a single rsync copy over SSH serving three consumers. You will probably prefer to wrap this in a script to keep things readable (a sketch follows the recovery.conf lines below). This also works for restore_command and archive_cleanup_command in recovery.conf:

restore_command = 'cp /archived_xlog/$(hostname)/%f %p'
archive_cleanup_command = 'pg_archivecleanup /archived_xlog/$(hostname) %r'
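
As a minimal sketch of such a wrapper (the script name, variable names and host list are illustrative, reusing the layout of the example above, not something shipped with PostgreSQL):

#!/bin/sh
# archive_wal.sh -- hypothetical wrapper for archive_command, applying the
# hard-link approach described above; hosts and paths are examples.
# postgresql.conf would then use: archive_command = '/path/to/archive_wal.sh %p %f'
set -e

WAL_PATH="$1"    # %p: path of the WAL segment to archive
WAL_NAME="$2"    # %f: file name of the WAL segment
ARCHIVE_HOST="slave1"
PRIMARY_DIR="/archived_xlog/slave1"
OTHER_CONSUMERS="slave2 pitr"

# copy the segment once to the archive host
rsync -a "$WAL_PATH" "$ARCHIVE_HOST:$PRIMARY_DIR/$WAL_NAME"

# then create one hard link per additional consumer; the data is freed
# only when every consumer has removed its own link
ssh "$ARCHIVE_HOST" "for h in $OTHER_CONSUMERS; do ln $PRIMARY_DIR/$WAL_NAME /archived_xlog/\$h/$WAL_NAME; done"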

Wednesday 30 December 2015 at 21:01

Monday 21 December 2015

Actualités PostgreSQL.fr

PostgreSQL Weekly News - 20 December 2015

PostgreSQL 9.5 RC1 is available. Get testing! https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5

The annual Indian PGDay will be held in Bangalore (Karnataka, India) on 26 February 2016. The call for speakers is open: http://pgday.in

Nordic PGDay, a one-day series of talks, will take place in Helsinki, Finland, on 17 March 2016. The call for speakers is open: http://2016.nordicpgday.org/

PostgreSQL-related job offers for December

PostgreSQL Local

  • PostgreSQL@SCaLE is a two-day, two-track event that will take place on 21 and 22 January 2016 at the Pasadena Convention Center, as part of SCaLE 14X. https://www.socallinuxexpo.org/scale/14x/cfp
  • FOSDEM PGDay is a one-day conference that will be held before FOSDEM in Brussels, Belgium, on 29 January 2016. Details and call for papers: http://fosdem2016.pgconf.eu/
  • Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference held on 17 and 18 February 2016 in Prague, Czech Republic. Czech-language site: http://www.p2d2.cz/
  • The first pan-Asian PostgreSQL conference will be held in March 2016 in Singapore. The call for papers is open: http://2016.pgday.asia/
  • PGConf US 2016 will take place on 18, 19 and 20 April in New York City. The call for papers closes on 31 January 2016 at 23:59 EST: http://www.pgconf.us/2016/
  • PGCon 2016 will be held from 17 to 21 May 2016 in Ottawa. The call for papers has been launched: http://www.pgcon.org/

PostgreSQL in the media

PostgreSQL Weekly News is brought to you this week by David Fetter. Translated by the PostgreSQLFr team under a CC BY-NC-SA licence. The original version can be found at: http://www.postgresql.org/message-id/20151221041252.GB16788@fetter.org

Submit your articles or announcements before Sunday 15:00 Pacific time. Please send them in English to david (a) fetter.org, in German to pwn (a) pgug.de, in Italian to pwn (a) itpug.org and in Spanish to pwn (a) arpug.com.ar.

Applied patches

Andres Freund pushed:

  • Correct statement to actually be the intended assert statement. e3f4cfc7 introduced a LWLockHeldByMe() call, without the corresponding Assert() surrounding it. Spotted by Coverity. Backpatch: 9.1+, like the previous commit http://git.postgresql.org/pg/commitdiff/2a3544960eaa114d34e5e83ab19e180c8efcd299
  • Fix bug in SetOffsetVacuumLimit() triggered by find_multixact_start() failure. Previously, if find_multixact_start() failed, SetOffsetVacuumLimit() would install 0 into MultiXactState->offsetStopLimit if it previously succeeded. Luckily, there are no known cases where find_multixact_start() will return an error in 9.5 and above. But if it were to happen, for example due to filesystem permission issues, it'd be somewhat bad: GetNewMultiXactId() could continue allocating mxids even if close to a wraparound, or it could erroneously stop allocating mxids, even if no wraparound is looming. The wrong value would be corrected the next time SetOffsetVacuumLimit() is called, or by a restart. Reported-By: Noah Misch, although this is not his preferred fix Discussion: 20151210140450.GA22278@alap3.anarazel.de Backpatch: 9.5, where the bug was introduced as part of 4f627f http://git.postgresql.org/pg/commitdiff/cca705a5d93446e1a9c775b94c7d5900986c0488
  • Fix tab completion for ALTER ... TABLESPACE ... OWNED BY. Previously the completion used the wrong word to match 'BY'. This was introduced brokenly, in b2de2a. While at it, also add completion of IN TABLESPACE ... OWNED BY and fix comments referencing nonexistent syntax. Reported-By: Michael Paquier Author: Michael Paquier and Andres Freund Discussion: CAB7nPqSHDdSwsJqX0d2XzjqOHr==HdWiubCi4L=Zs7YFTUne8w@mail.gmail.com Backpatch: 9.4, like the commit introducing the bug http://git.postgresql.org/pg/commitdiff/130d94a7b868f5b6df512e5fde94a64e5e71178b

Heikki Linnakangas pushed:

  • Fix out-of-memory error handling in ParameterDescription message processing. If libpq ran out of memory while constructing the result set, it would hang, waiting for more data from the server, which might never arrive. To fix, distinguish between out-of-memory error and not-enough-data cases, and give a proper error message back to the client on OOM. There are still similar issues in handling COPY start messages, but let's handle that as a separate patch. Michael Paquier, Amit Kapila and me. Backpatch to all supported versions. http://git.postgresql.org/pg/commitdiff/7b96bf445a42b1cb2a435854f9825c38253f79a2

Kevin Grittner pushed:

  • Remove xmlparse(document '') test This one test was behaving differently between the ubuntu fix for CVE-2015-7499 and the base "expected" file. It's not worth having yet another version of the expected file for this test, so drop it. Perhaps at some point when all distros have settled down to the same behavior on this test, it can be restored. Problem found by me on libxml2 (2.9.1+dfsg1-3ubuntu4.6). Solution suggested by Tom Lane. Backpatch to 9.5, where the test was added. http://git.postgresql.org/pg/commitdiff/e2f1765ce0770e813971336bb4603099d24cbe57

Álvaro Herrera pushed:

  • Add missing CHECK_FOR_INTERRUPTS in lseg_inside_poly. Apparently, there are bugs in this code that cause it to loop endlessly. That bug still needs more research, but in the meantime it's clear that the loop is missing a check for interrupts so that it can be cancelled timely. Backpatch to 9.1 -- this has been missing since 49475aab8d0d. http://git.postgresql.org/pg/commitdiff/0d8f3d5d11f7304c82ce1383bbb491ec6abcffc4
  • Rework internals of changing a type's ownership. This is necessary so that REASSIGN OWNED does the right thing with composite types, to wit, that it also alters ownership of the type's pg_class entry -- previously, the pg_class entry remained owned by the original user, which caused later other failures such as the new owner's inability to use ALTER TYPE to rename an attribute of the affected composite. Also, if the original owner is later dropped, the pg_class entry becomes owned by a non-existant user which is bogus. To fix, create a new routine AlterTypeOwner_oid which knows whether to pass the request to ATExecChangeOwner or deal with it directly, and use that in shdepReassignOwner rather than calling AlterTypeOwnerInternal directly. AlterTypeOwnerInternal is now simpler in that it only modifies the pg_type entry and recurses to handle a possible array type; higher-level tasks are handled by either AlterTypeOwner directly or AlterTypeOwner_oid. I took the opportunity to add a few more objects to the test rig for REASSIGN OWNED, so that more cases are exercised. Additional ones could be added for superuser-only-ownable objects (such as FDWs and event triggers) but I didn't want to push my luck by adding a new superuser to the tests on a backpatchable bug fix. Per bug #13666 reported by Chris Pacejo. Backpatch to 9.5. (I would back-patch this all the way back, except that it doesn't apply cleanly in 9.4 and earlier because 59367fdf9 wasn't backpatched. If we decide that we need this in earlier branches too, we should backpatch both.) http://git.postgresql.org/pg/commitdiff/756e7b4c9db1fa713b886068643257c823baddaf

Tom Lane pushed:

  • Add missing cleanup logic in pg_rewind/t/005_same_timeline.pl test. Per Michael Paquier http://git.postgresql.org/pg/commitdiff/db81329eed6b1f54bbdd9049bcdba556f2b4737d
  • Update 9.5 release notes through today. Also do another round of copy-editing, and fix up remaining FIXME items. http://git.postgresql.org/pg/commitdiff/bfc7f5dd5dc641b475c27b872d6df21c20c75af1
  • Document use of Subject Alternative Names in SSL server certificates. Commit acd08d764 did not bother with updating the documentation. http://git.postgresql.org/pg/commitdiff/0625dbb0b96e2ecd557eb5bcdc458679123951db
  • Cope with Readline's failure to track SIGWINCH events outside of input. It emerges that libreadline doesn't notice terminal window size change events unless they occur while collecting input. This is easy to stumble over if you resize the window while using a pager to look at query output, but it can be demonstrated without any pager involvement. The symptom is that queries exceeding one line are misdisplayed during subsequent input cycles, because libreadline has the wrong idea of the screen dimensions. The safest, simplest way to fix this is to call rl_reset_screen_size() just before calling readline(). That causes an extra ioctl(TIOCGWINSZ) for every command; but since it only happens when reading from a tty, the performance impact should be negligible. A more valid objection is that this still leaves a tiny window during entry to readline() wherein delivery of SIGWINCH will be missed; but the practical consequences of that are probably negligible. In any case, there doesn't seem to be any good way to avoid the race, since readline exposes no functions that seem safe to call from a generic signal handler --- rl_reset_screen_size() certainly isn't. It turns out that we also need an explicit rl_initialize() call, else rl_reset_screen_size() dumps core when called before the first readline() call. rl_reset_screen_size() is not present in old versions of libreadline, so we need a configure test for that. (rl_initialize() is present at least back to readline 4.0, so we won't bother with a test for it.) We would need a configure test anyway since libedit's emulation of libreadline doesn't currently include such a function. Fortunately, libedit seems not to have any corresponding bug. Merlin Moncure, adjusted a bit by me http://git.postgresql.org/pg/commitdiff/2ec477dc8108339dcb6bb944fa93d19cafb6fff7
  • Fix improper initialization order for readline. Turns out we must set rl_basic_word_break_characters *before* we call rl_initialize() the first time, because it will quietly copy that value elsewhere --- but only on the first call. (Love these undocumented dependencies.) I broke this yesterday in commit 2ec477dc8108339d; like that commit, back-patch to all active branches. Per report from Pavel Stehule. http://git.postgresql.org/pg/commitdiff/aee7705be5b75d8e7873a32c4a0dd0afe1ae5928
  • Adjust behavior of single-user -j mode for better initdb error reporting. Previously, -j caused the entire input file to be read in and executed as a single command string. That's undesirable, not least because any error causes the entire file to be regurgitated as the "failing query". Some experimentation suggests a better rule: end the command string when we see a semicolon immediately followed by two newlines, ie, an empty line after a query. This serves nicely to break up the existing examples such as information_schema.sql and system_views.sql. A limitation is that it's no longer possible to write such a sequence within a string literal or multiline comment in a file meant to be read with -j; but there are no instances of such a problem within the data currently used by initdb. (If someone does make such a mistake in future, it'll be obvious because they'll get an unterminated-literal or unterminated-comment syntax error.) Other than that, there shouldn't be any negative consequences; you're not forced to end statements that way, it's just a better idea in most cases. In passing, remove src/include/tcop/tcopdebug.h, which is dead code because it's not included anywhere, and hasn't been for more than ten years. One of the debug-support symbols it purported to describe has been unreferenced for at least the same amount of time, and the other is removed by this commit on the grounds that it was useless: forcing -j mode all the time would have broken initdb. The lack of complaints about that, or about the missing inclusion, shows that no one has tried to use TCOP_DONTUSENEWLINE in many years. http://git.postgresql.org/pg/commitdiff/66d947b9d302f1fd6de3d156e6ec61f52e1dc2cb
  • Use just one standalone-backend session for initdb's post-bootstrap steps. Previously, each subroutine in initdb fired up its own standalone backend session. Over time we'd grown as many as fifteen of these sessions, and the cumulative startup and shutdown work for them was getting pretty noticeable. Combining things so that all these steps share a single backend session cuts a good 10% off the total runtime of initdb, more if you're not fsync'ing. The main stumbling block to doing this before was that some of the sessions were run with -j and some not. The improved definition of -j mode implemented by my previous commit makes it possible to fix that by running all the post-bootstrap steps with -j; we just have to use double instead of single newlines to end command strings. (This is only absolutely necessary around the VACUUM and CREATE DATABASE steps, since those can't be run in a transaction block. But it seems best to make them all use double newlines so that the commands remain separate for error-reporting purposes.) A minor disadvantage is that since initdb can't tell how much of its output the backend has executed, we can no longer have the per-step progress reporting initdb used to print. But things are fast enough nowadays that that's not really all that useful anyway. In passing, add more const decoration to some of the static arrays in initdb.c. http://git.postgresql.org/pg/commitdiff/c4a8812cf64b142685e39a69694c5276601f40e4
  • Remove unreferenced function declarations. datapagemap_create() and datapagemap_destroy() were declared extern, but they don't actually exist anywhere. Per YUriy Zhuravlev and Michael Paquier. http://git.postgresql.org/pg/commitdiff/3d0c50ffa0bdb683c28bfe0e79d23d87111da2aa
  • Adopt a more compact, less error-prone notation for tab completion code. Replace tests like else if (pg_strcasecmp(prev4_wd, "CREATE") == 0 && pg_strcasecmp(prev3_wd, "TRIGGER") == 0 && (pg_strcasecmp(prev_wd, "BEFORE") == 0 || pg_strcasecmp(prev_wd, "AFTER") == 0)) with new notation like this: else if (TailMatches4("CREATE", "TRIGGER", MatchAny, "BEFORE|AFTER")) In addition, provide some macros COMPLETE_WITH_LISTn() to reduce the amount of clutter needed to specify a small number of predetermined completion alternatives. This makes the code substantially more compact: tab-complete.c gets over a thousand lines shorter in this patch, despite the addition of a couple of hundred lines of infrastructure for the new notations. The new way of specifying match rules seems a whole lot more readable and less error-prone, too. There's a lot more that could be done now to make matching faster and more reliable; for example I suspect that most of the TailMatches() rules should now be Matches() rules. That would allow them to be skipped after a single integer comparison if there aren't the right number of words on the line, and it would reduce the risk of unintended matches. But for now, (mostly) refrain from reworking any match rules in favor of just converting what we've got into the new notation. Thomas Munro, reviewed by Michael Paquier, some adjustments by me http://git.postgresql.org/pg/commitdiff/d37b816dc9e8f976c8913296781e08cbd45c5af1
  • Add missing COSTS OFF to EXPLAIN commands in rowsecurity.sql. Commit e5e11c8cc added a bunch of EXPLAIN statements without COSTS OFF to the regression tests. This is contrary to project policy since it results in unnecessary platform dependencies in the output (it's just luck that we didn't get buildfarm failures from it). Per gripe from Mike Wilson. http://git.postgresql.org/pg/commitdiff/654218138b819df66c1b90d39a12ca6a75b9ff65
  • Teach psql's tab completion to consider the entire input string. Up to now, the tab completion logic has only examined the last few words of the current input line; "last few" being originally as few as four words, but lately up to nine words. Furthermore, it only looked at what libreadline considers the current line of input, which made it rather myopic if you split your command across lines. This was tolerable, sort of, so long as the match patterns were only designed to consider the last few words of input; but with the recent addition of HeadMatches() and Matches() matching rules, we really have to do better if we want those to behave sanely. Hence, change the code to break the entire line down into words, and to include any previous lines in the command buffer along with the active readline input buffer. This will be a little bit slower than the previous coding, but some measurements say that even a query of several thousand characters can be parsed in a hundred or so microseconds on modern machines; so it's really not going to be significant for interactive tab completion. To reduce the cost some, I arranged to avoid the per-word malloc calls that used to occur: all the words are now kept in one malloc'd buffer. http://git.postgresql.org/pg/commitdiff/d854118c8df8c413d069f7e88bb01b9e18e4c8ed
  • Remove silly completion for "DELETE FROM tabname ...". psql offered USING, WHERE, and SET in this context, but SET is not a valid possibility here. Seems to have been a thinko in commit f5ab0a14ea83eb6c which added DELETE's USING option. http://git.postgresql.org/pg/commitdiff/99ccb2309263183f0f3d838b79f3e07ad8cc6a63

Stephen Frost pushed:

  • Collect the global OR of hasRowSecurity flags for plancache We carry around information about if a given query has row security or not to allow the plancache to use that information to invalidate a planned query in the event that the environment changes. Previously, the flag of one of the subqueries was simply being copied into place to indicate if the query overall included RLS components. That's wrong as we need the global OR of all subqueries. Fix by changing the code to match how fireRIRules works, which results in OR'ing all of the flags. Noted by Tom. Back-patch to 9.5 where RLS was introduced. http://git.postgresql.org/pg/commitdiff/e5e11c8cca7ae298895430102217fa6d77cfb2a3
  • Improve CREATE POLICY documentation Clarify that SELECT policies are now applied when SELECT rights are required for a given query, even if the query is an UPDATE or DELETE query. Pointed out by Noah. Additionally, note the risk regarding concurrently open transactions where a relation which controls access to the rows of another relation are updated and the rows of the primary relation are also being modified. Pointed out by Peter Geoghegan. Back-patch to 9.5. http://git.postgresql.org/pg/commitdiff/43cd468cf01007f39312af05c4c92ceb6de8afd8

Robert Haas pushed:

  • Provide a way to predefine LWLock tranche IDs. It's a bit cumbersome to use LWLockNewTrancheId(), because the returned value needs to be shared between backends so that each backend can call LWLockRegisterTranche() with the correct ID. So, for built-in tranches, use a hard-coded value instead. This is motivated by an upcoming patch adding further built-in tranches. Andres Freund and Robert Haas http://git.postgresql.org/pg/commitdiff/3fed417452b226d9bd85a3a54d7056b06eb14897
  • Move buffer I/O and content LWLocks out of the main tranche. Move the content lock directly into the BufferDesc, so that locking and pinning a buffer touches only one cache line rather than two. Adjust the definition of BufferDesc slightly so that this doesn't make the BufferDesc any larger than one cache line (at least on platforms where a spinlock is only 1 or 2 bytes). We can't fit the I/O locks into the BufferDesc and stay within one cache line, so move those to a completely separate tranche. This leaves a relatively limited number of LWLocks in the main tranche, so increase the padding of those remaining locks to a full cache line, rather than allowing adjacent locks to share a cache line, hopefully reducing false sharing. Performance testing shows that these changes make little difference on laptop-class machines, but help significantly on larger servers, especially those with more than 2 sockets. Andres Freund, originally based on an earlier patch by Simon Riggs. Review and cosmetic adjustments (including heavy rewriting of the comments) by me. http://git.postgresql.org/pg/commitdiff/6150a1b08a9fe7ead2b25240be46dddeae9d98e1
  • Teach mdnblocks() not to create zero-length files. It's entirely surprising that mdnblocks() has the side effect of creating new files on disk, so let's make it not do that. One consequence of the old behavior is that, if running on a damaged cluster that is missing a file, mdnblocks() can recreate the file and allow a subsequent _mdfd_getseg() for a higher segment to succeed. This happens because, while mdnblocks() stops when it finds a segment that is shorter than 1GB, _mdfd_getseg() has no such check, and thus the empty file created by mdnblocks() can allow it to continue its traversal and find higher-numbered segments which remain. It might be a good idea for _mdfd_getseg() to actually verify that each segment it finds is exactly 1GB before proceeding to the next one, but that would involve some additional system calls, so for now I'm just doing this much. Patch by me, per off-list analysis by Kevin Grittner and Rahila Syed. Review by Andres Freund. http://git.postgresql.org/pg/commitdiff/049469e7e7cfe0c69d30385952e2576b63230283
  • Mark CHECK constraints declared NOT VALID valid if created with table. FOREIGN KEY constraints have behaved this way for a long time, but for some reason the behavior of CHECK constraints has been inconsistent up until now. Amit Langote and Amul Sul, with assorted tweaks by me. http://git.postgresql.org/pg/commitdiff/f27a6b15e6566fba7748d0d9a3fc5bcfd52c4a1b
  • Speed up CREATE INDEX CONCURRENTLY's TID sort. Encode TIDs as 64-bit integers to speed up comparisons. This seems to speed things up on all platforms, but is even more beneficial when 8-byte integers are passed by value. Peter Geoghegan. Design suggestions and review by Tom Lane. Review also by Simon Riggs and by me. http://git.postgresql.org/pg/commitdiff/b648b70342fbe712383e8cd76dc8f7feaba9aaa3
  • Fix typo in comment. Amit Langote http://git.postgresql.org/pg/commitdiff/9a51698bae86f748279ecedcae018925b5af5b2d
  • Fix copy-and-paste error in logical decoding callback. This could result in the error context misidentifying where the error actually occurred. Craig Ringer http://git.postgresql.org/pg/commitdiff/4496226782c47e78b428babbcc16dec4f7329f2b
  • Fix TupleQueueReaderNext not to ignore its nowait argument. This was a silly goof on my (rhaas's) part. Report and fix by Rushabh Lathia. http://git.postgresql.org/pg/commitdiff/2bdfcb52c5d1446a1f19cc8bf16d44911658bcac
  • Remove duplicate word. Kyotaro Horiguchi http://git.postgresql.org/pg/commitdiff/6e7b335930200f71115fccd4903d04fe4de42021
  • pgbench: Change terminology from "threshold" to "parameter". Per a recommendation from Tomas Vondra, it's more helpful to refer to the value that determines how skewed a Gaussian or exponential distribution is as a parameter rather than a threshold. Since it's not quite too late to get this right in 9.5, where it was introduced, back-patch this. Most of the patch changes only comments and documentation, but a few pgbench messages are altered to match. Fabien Coelho, reviewed by Michael Paquier and by me. http://git.postgresql.org/pg/commitdiff/3c7042a7d7871b47dae3c9777c8020e41dedee89

Teodor Sigaev pushed:

Peter Eisentraut pushed:

Rejected patches (to date)

No one was disappointed this week :-)

Pending patches

Amit Langote sent in another revision of a patch to fix an error in find_inheritance_children().

Craig Ringer sent in another revision of a patch to add logical decoding for sequence advances.

Tomas Vondra sent in a patch to postpone building buckets to the end of Hash (in HashJoin).

Aleksander Alekseev sent in another revision of a patch to speed up identification of resource owners, one outcome of which will be to speed up accesses to tables with many partitions.

Daniel Verité and Pavel Stěhule traded patches to implement \crosstabview in psql.

Michael Paquier sent in three more revisions of a patch to extend pgbench expressions with functions.

Andreas Karlsson sent in a patch to make it possible to cancel a query which is running the crypt() function with the bf or xdes hashing algorithm by adding a CHECK_FOR_INTERRUPTS() call every round.

Andres Freund sent in a patch to make a faster PageIsVerified() for the all zeroes case and improve scalability of md.c for large relations.

David Rowley sent in three more revisions of a patch to improve joins where the outer side is unique.

Aleksander Alekseev sent in a patch to ensure that the initial sizes of the PROCLOCK and LOCK hash tables be the same as max_size, saving time that would have been spent expanding them.

Haribabu Kommi sent in another revision of a patch to add parallel aggregation.

Tomas Vondra sent in a patch to add a bloom filter in Hash Joins with batches.

Haribabu Kommi sent in another revision of a patch to add a pg_hba_lookup() function to get all matching pg_hba.conf entries.

Fabien COELHO sent in another revision of a patch to add pgbench stats per script, etc.

Haribabu Kommi sent in another revision of a patch to implement multi-tenancy with RLS.

Jesper Pedersen sent in another revision of a patch to add new LWLOCK_STATS statistics.

Tatsuro Yamada sent in a patch to fix a typo in regress/sql/privileges.sql.

David Rowley sent in another revision of a patch to implement combining aggregates, useful in parallelizing aggregation.

SAWADA Masahiko sent in another revision of a patch to support N synchronous standby servers for N>1.

Amit Langote sent in a patch to fix a comment typo in pg_upgrade.c.

Aleksander Alekseev sent in a patch to fix lock contention for HASHHDR.mutex.

Michael Paquier sent in a patch to fix a psql call in pg_upgrade.

SAWADA Masahiko sent in another revision of a patch to add a "frozen" bit to the visibility map.

Peter Geoghegan sent in another revision of a patch to reuse abbreviated keys during second pass of ordered [set] aggregates.

Ashutosh Bapat and Robert Haas traded patches to allow foreign scans to be sorted on the remote side.

Alexander Korotkov sent in a patch to improve statistics for array types.

Michael Paquier sent in a patch to add a system view and function to report WAL receiver activity.

Robert Haas sent in another revision of a patch to implement parallel joins, and better parallel explain.

Artur Zakirov sent in a patch to add fuzzy substring searching with the pg_trgm extension.

David Rowley sent in a patch to wrap error code paths in unlikely().

Mithun Cy sent in a patch to cache data in GetSnapshotData().

Pavel Stěhule sent in another revision of a patch to add a pg_size_bytes() to transform human-readable byte sizes into bytes.

Michael Paquier sent in another revision of a patch to reduce some unneeded WAL logging in hot standby.

by N Bougain on Monday 21 December 2015 at 16:04

Thursday 19 November 2015

Guillaume Lelarge

Final version of the book

It is not out yet. It is practically finished; we are waiting to receive the printed version of the book.

Nevertheless, I can already list what is new compared to beta 0.4:

  • Overall
    • text updated for 9.5
    • added the chapter on security
    • added the chapter on query planning
    • examples updated with PostgreSQL 9.5 beta 1
  • Files
    • Added a diagram of the relationships between tables, the FSM and the VM
    • Added a description of the pg_dynshmem and pg_logical directories
  • File contents
    • Added information on how data is stored, column by column
    • Added a diagram of the logical and physical structure of a B-tree index
    • Added a description of GIN indexes
    • Added a description of GiST indexes
    • Added a description of SP-GiST indexes
  • Memory architecture
    • computing work_mem for a sort
    • computing maintenance_work_mem for a VACUUM
  • Transaction management
    • Handling of locks and concurrent access
  • Maintenance
    • Description of the output of VACUUM VERBOSE

I admit I cannot wait to hold the final version in my hands :-) Well, yes, it does represent a year and a half of hard work!

by Guillaume Lelarge on Thursday 19 November 2015 at 22:36

Wednesday 21 October 2015

Nicolas Thauvin

pitrery 1.10

A week after the release of version 1.9, a colleague found a bug in restore_xlog, the WAL file restore script. When SSH is used to store the archived WAL files, the script ignores the parameters from the configuration file that specify the user and the remote host.

As a result, the restore does not work when PostgreSQL is started, unless the call to restore_xlog in the restore_command parameter of recovery.conf is modified to pass those two pieces of information, since they are honoured when given on the command line.

You can also add the following to the configuration file:

RESTORE_COMMAND="/path/to/restore_xlog -C <config_file> -h <archive_host> -u <archive_user> %f %p"
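
Equivalently, until the fix is deployed, the full call can be spelled out directly in recovery.conf; the placeholders below follow the same convention as the line above:

restore_command = '/path/to/restore_xlog -C <config_file> -h <archive_host> -u <archive_user> %f %p'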

Version 1.10 fixes this problem; see the pitrery website.

Wednesday 21 October 2015 at 16:42

Tuesday 22 September 2015

Guillaume Lelarge

Beta 0.4 of the book

The previous beta dated back to mid-May. A lot has happened during the four months since then. Four new chapters are now available:

  • Backup
  • Replication
  • Statistics
  • Maintenance

But that is obviously not all. Among the important additions:

  • Files, Processes and Memory chapters
    • Added the disk/process/memory diagrams
  • Physical file contents chapter
    • Moved the information on the contents of the transaction logs (WAL) into this chapter
    • Added a description of the contents of a B-tree index
    • Added a description of the contents of a Hash index
    • Added a description of the contents of a BRIN index
    • Restructured the chapter as a whole
  • Process architecture chapter
    • Added subsections to the description of the postmaster and startup processes
    • Added an example of the unexpected death of a PostgreSQL server process
  • Memory architecture chapter
    • Added more detail on the buffer cache (shared_buffers)
  • Transaction management chapter
    • Added information on the CLOG, FrozenXid and hint bits
  • Object management chapter
    • Added a section on the security-specific options of views and functions
    • Added a paragraph on the serial pseudo-type
  • Miscellaneous
    • Updated the examples to PostgreSQL 9.4.4

In short, it is available over here.

As for the next release? It should be the final version. It will include the Security chapter (already written and under review) and the chapter on the query planner (being written). It should also be fully updated for version 9.5 (whose beta should be out at the beginning of October).

Happy reading, and I am always interested in hearing what you think of it (via the forum set up by the publisher or via my email address).

by Guillaume Lelarge on Tuesday 22 September 2015 at 21:59

Monday 10 August 2015

Rodolphe Quiédeville

Using pg_shard with Django

Last winter CitusData open-sourced its sharding tool pg_shard; the code is now published under the LGPL version 3 licence and is available on GitHub. Version 1.2 was released on 30 July, which gave me the opportunity to test Django's compatibility with this new PostgreSQL extension.

As a reminder, sharding distributes the contents of a table across several servers; pg_shard can also manage multiple copies of the same shard in order to cope with the failure of one of the nodes. The main benefit of sharding is guaranteeing scalability when the data volume grows quickly, with data access always going through the main node, without having to deal with the secondary nodes, which remain transparent to the client.

Let's say it straight away, rather than letting the suspense build: Django is not compatible with pg_shard, for three main reasons detailed below. Other points may be blockers as well, but I did not investigate any further after running into these first obstacles.

When saving a new object to the database, Django uses the RETURNING clause of the INSERT to retrieve the object's id. To date pg_shard does not support RETURNING; a ticket is open, so let's hope a future release ships with this feature.
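
As a minimal sketch of the pattern in question (the database, table and column names are made up for illustration), this is the shape of statement Django emits and that pg_shard currently rejects on a distributed table:

# hypothetical check against a pg_shard-distributed table named app_item
psql -d appdb <<'SQL'
-- Django fetches the new primary key in the same round trip;
-- pg_shard 1.2 does not accept the RETURNING clause here.
INSERT INTO app_item (name) VALUES ('demo') RETURNING id;
SQL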

More problematic, because it would require a somewhat deeper hack in Django's ORM, is the lack of support for sequences, which the SERIAL type uses to provide automatic, unique numbering of primary keys. That is the type Django uses by default for primary keys. Here again, discussions are under way to support sequences in pg_shard.
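
For context (again with a hypothetical table), Django's default integer primary key corresponds to a serial column, and it is that implicit sequence which pg_shard cannot yet handle:

psql -d appdb <<'SQL'
-- serial creates an implicit sequence and a nextval() column default;
-- this is the sequence machinery that pg_shard does not support yet.
CREATE TABLE app_item (
    id   serial PRIMARY KEY,
    name text NOT NULL
);
SQL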

Finally, and this is perhaps the biggest blocker for use with Django or any other ORM, pg_shard does not support multi-statement transactions. Transactions being the foundation of data-integrity guarantees, unless your use case never modifies more than one piece of data at a time, this alone can be a reason not to adopt pg_shard as it stands.

Despite these observations, pg_shard remains a very interesting solution, one to keep on your technology watch list at a time when big data comes up so often in conversations around the coffee machine.

by Rodolphe Quiédeville on Monday 10 August 2015 at 10:31