FeedLounge now running on PostgreSQL

This post contains some technical information that will be very interesting to other developers, but may not be so interesting to those that don’t work with databases.

For those of you that just want the nutshell:

We’ve switched to new database server software because we think the new software will be faster and store the same data while using less disk space (making our storage costs lower).

Now for the detailed explanation:

Back in August, we switched the FeedLounge database from MySQL’s MyISAM tables to InnoDB tables. This solved a number of problems that we were having with the MyISAM tables, but also introduced a number of other issues including slower performance on a number of queries (counting in particular). As we did more and more research into how to solve these problems, we decided that one promising solution would be to switch to a PostgreSQL database instead of a MySQL database altogether.

That is what we were testing this weekend, and the alpha server is now running live with a PostgreSQL database instead of the MySQL database. This is still in “testing” mode - if we run into any serious issues we might switch back (temporarily or permanently) to the MySQL database (which will have data current through last Friday evening).

Some of the reasons we decided to make the switch to PostgreSQL:

  • Database size - when we switched from MyISAM tables to InnoDB tables in MySQL, the size of our database grew from ~1GB to 10+GB! When we made the switch this weekend, the MySQL InnoDB database was using 34 GB, and the same data in a PostgreSQL database is only 9.6 GB - this should keep our hardware costs down a bit.
  • Load time - The current MySQL setup takes over a day to restore the current database, the load of the data into the PostgreSQL database took just over 4 hours.

I was once given a simplistic summary of the MySQL and PostgreSQL databases:

MySQL was built with speed as the primary feature, PostgreSQL was built with more robust features like triggers, but lacked the speed of MySQL. As both are maturing, they are moving towards the other. MySQL 5 adds triggers and stored procedures, while PostgreSQL is focused on improving performance.

For an application like FeedLounge, the faster counting and smaller row size of PostgreSQL are compelling reasons to use it instead of MySQL with InnoDB tables. Add that it runs in 1/3 the time, using 1/3 the memory (making it essentially 10x faster for us) and that we can start taking advantage of other Postgres features, like data partitioning and partial indices in the near future, and you have the reason we decided to make the switch.

This week should be an interesting one, we’re looking forward to getting feedback from our alpha users on how FeedLounge is performing on the new database.

Posted November 20th, 2005 @ 10:50 PM in System, Development by Alex

78 Replies to “FeedLounge now running on PostgreSQL”

  1. FeedLounge Goes PostgreSQL

    The FeedLounge guys have moved over to PostgreSQL, and from talking to them while they were contemplating everything, this is a great move. Folks have probably been wondering what the bottlenecks in FeedLounge development have been, and I’m her…

    November 21st, 2005 at 7:49 am
  2. FeedLounge Switches To PostgreSQL

    This could turn into an interesting case study, FeedLounge has switched databases, from MySQL to PostgreSQL. The two main factors they listed were database size (34Gig for MySQL/InnoDB vs. 9.6Gig for PostgreSQL) and restoration time (24+ hours to res…

    November 21st, 2005 at 11:03 am
  3. Interesting.

    Two issues…

    Are you sure the table size actually grew? Did you try optimize table?

    WRT load time. What variables did you tune? MySQL is picky about internal variables when doing a load. You can really see a 100x increase in performance. It might be that Postgress is simply tuned better. This has bitten me before. ALTER table also has the same problem.

    WRT memory.. I’d be interested in knowing more here. I have an explicit memory config so I know how much memory I’m using. Are you saying that cache btrees on Postgress are smaller? That would certainly be nice! The lower the memory -> disk ration the better.

    Also…. do you store blobs in the DB? You might want to move to a distributed flat filesystem… I can license you one :)

    Kevin

    November 22nd, 2005 at 10:43 pm
  4. I’ll let Scott handle most of these, but I know that we were optimizing the tables nearly weekly.

    November 22nd, 2005 at 11:17 pm
  5. We run PostgreSQL on a high traffic site at work and we have no problems with it at all. At the moment, we’re taking approximately 1,000,000 pages per month and it is laughing at it. What sort of load are you putting PostgreSQL under?

    Al.

    November 23rd, 2005 at 4:58 pm
  6. We were over 4 million queries a day on the MySQL database, I think the latest optimizations have reduced that a little, but I haven’t run stats since then. My guess would be in the 4 million a day range.

    November 23rd, 2005 at 5:11 pm
  7. I’ve never bothered to look at the total number of queries, I’ll see if I can garner that information and get back here; just for the sake of comparison.

    Al.

    November 23rd, 2005 at 5:34 pm
  8. Kevin,

    The tables were optimized weekly, which took about 5 hours on the last optimize cycle.

    With regard to load, I was only using that as a comparison. Given our application, MySQL InnoDB runs the queries that FeedLounge uses on a daily basis in 3 times the amount of time, given 3 times the memory. So we are comparing apples to apples (our application in both instances), and Postgres is trouncing MySQL InnoDB once the indices can no longer stay in memory.

    My memory config on the live server was 6GB to the InnoDB buffer pool, with an O_DIRECT flush config. With Postgres, I have allowed it to take up to 2GB of RAM, but it has so far only used about 1.2.

    While we do not currently store blobs in the database, do let me know about your distributed flat filesystem. Send details to my apache address (you should know it), I would be interested in hearing more.

    November 24th, 2005 at 10:33 am
  9. Our users (myself included) have noticed a difference too.

    November 24th, 2005 at 10:43 am
  10. You might be misinterpreting the part about memory usage.

    PostgreSQL may not be set up to use all your RAM as shared_buffers, but it definitely benefits from the OS cache, so it doesn’t really use 1/3 as much RAM as MySQL. Actually, you very rarely want to give it as much as 2GB of RAM — shared_buffers would usually be closer to 50000 (300MB), just remember to set effective_cache_size right so the optimizer knows you have ~5.5GB of cache.

    I’m not surprised at the other points, though :-)

    November 25th, 2005 at 7:41 am
  11. Sesse, you are correct. Nevertheless, MySQL is just not pulling its weight compared to Postgres.

    November 25th, 2005 at 9:53 am
  12. Alex,

    Welcome to the club. If you need help, there’s generally people on #postgresql at irc.freenode.net 24 hours a day.

    November 30th, 2005 at 6:46 pm
  13. Pawel adds this Comment:

    Hi,

    I was wondering what have you used to transfer such a extensive db.
    I’ve just installed postgres in hope that table partitioning will increase functionality of currently 1GB MyIsam table.

    Pawel

    December 1st, 2005 at 11:13 am
  14. FeedLounge moves to PostgreSQL

    FeedLounge is a nifty web based RSS/Atom reader. After getting fed up with MySQL, they moved to PostgreSQL and were fairly pleased with the results.

    December 1st, 2005 at 4:19 pm
  15. Pawel,

    We just dumped the database using mysqldump, and then loaded using the psql utility.

    December 1st, 2005 at 5:02 pm
  16. Rob adds this Comment:

    Yeah, mysql’s import is slow. I suppose you locked the tables manually and it still took one day?!

    December 2nd, 2005 at 2:49 am
  17. PostgreSQL scales

    Via zillablog, I was notified of FeedLounge switching to PostgreSQL FeedLounge is just another in a serious of webbased services switching their RDBMS away from MySQL. For one thing, it’s the features that’s driving this. Postgres just has more featu…

    December 2nd, 2005 at 7:50 am
  18. JDD adds this Comment:

    InnoDB introduced a “compact” row format for MySQL 5 that could help significantly with the issue of data size. For a database running MySQL 4, upgrade to version 5.0 and run “ALTER TABLE t ROW_FORMAT = COMPACT.”

    In order to see how much space this saves, you can run SHOW TABLE STATUS both before and after the “alter table” query, and look at the data_length and index_length fields.

    I have seen tables that are only half as large with the new format.

    December 2nd, 2005 at 4:06 pm
  19. Sounds like something we should check out - though half as large is still twice the size of the PostgreSQL data size.

    December 2nd, 2005 at 4:14 pm
  20. You might be interested in a PostgreSQL performance tuning article I wrote awhile back. You can find it at http://revsys.com/writings/postgresql-performance.html it combines a lot of the current “best practices” I’ve gleaned over the years working with PostgreSQL as well information from discussions on the postgresql-performance list.

    December 2nd, 2005 at 6:50 pm
  21. Alex, Scott,

    I’m one of the Wikipedia techs and also a MySQL support engineer and I and one of our performance people are curious about your situation. If you’re interested in our feedback and/or in satisfying our curiosity, please email or post somewhere this information and we’ll let you know what we see:

    SHOW VARIABLES;
    SHOW INNODB STATUS\G (ideally twice ten minutes apart under load)
    SHOW STATUS; (ideally twice ten minutes apart under load)
    SHOW TABLE STATUS;
    SHOW CREATE TABLE tablename for all tables
    Some of your most common queries and EXPLAIN whatever the query output is for them.

    Some description of the order in which records are added - in primary key order or not - would be interesting also, since out of order inserts can halve the fill factor of the InnoDB nodes in the worst case.

    Our email addresses are james and peter, both at mysql.com.

    In case you’re curious, the Wikimedia MySQL boxes do something between 200 and 400 million queries per day but they are quite powerful - typically dual Opterons with 16GB of RAM and six 15,000 RPM SCSI drives.

    December 2nd, 2005 at 8:41 pm
  22. james and peter,

    I may be able to bring the old database up under load some other time, as we have moved on. I don’t suppose giving you any of this information would help on an idle database?

    We add records in no order, as all keys are GUIDs. The other problem is not in the read queries per day, but more that the FeedLounge application is about 50/50 read/write. How many inserts per day do the Wikipedia MySQL boxes do?

    As far as explain output, almost every single read query works using simple indexes (2 or 3 columns at most). Again, we have never really had a problem when using read queries, it was when you added the write queries to the mix at a 50% ratio.

    To be quite honest, even Postgres will fall over eventually with the write load we are placing on the database. Partioning the data is the only choice in an application like this, and Postgres is only buying us some time to get that implemented.

    Also, are the Wikipedia boxes doing any query caching? I heard without query caching that you top out around 35-40 million queries/day.

    December 2nd, 2005 at 11:55 pm
  23. Scott,

    Some will be of use under load, no need for two of the SHOW INNODB STATUS and SHOW STATUS though, one of each would be fine. If you happen to have some of these in session logs from the past that would be interesting. Do you have a disk controller which does write caching?

    GUIDs are interesting. Do you use text or binary/numeric form of them? How are they built - some custom GUID format or a generic one? There’s a bit of discussion about GUIDs in the LiveJournal context involving MySQL’s head of architecture (krow) at http://www.livejournal.com/users/brad/2173718.html . InnoDB will try to put them in sorted primary key order so you’re likely to get close to worst case fill factor for the rows unless you tweak the InnoDB source or have a time- or insert-order related component at the beginning of the GUID. Or maybe feed source order if that would better match the order your queries want. That could substantially raise your cache hit rate, possibly changing it from one page per record to many records per page. With the high write load, my guess is that something which will place the keys in time order would work better than something matching query order.

    GUIDs tend not to do well for index storage efficiency, since InnoDB stores the uncompressed primary key in all secondary index records.

    Inserts are relatively uncommon on the Wikimedia boxes. SHOW STATUS reports 10 statements or 300 row inserts plus 28 updates per second on the box I just checked. InnoDB reports: Number of rows inserted 77860665, updated 28219927, deleted 25913064, read 11311558611 . 22.63 inserts/s, 7.50 updates/s, 6.06 deletes/s, 3459.07 reads/s.

    The query cache isn’t very useful for Wikimedia servers under normal load. It’s purged any time a table involved in the query is changed and the insert rate is high enough that the hit rate is 0.08/s on the slave I checked and 0.003/s on the master. If it wasn’t for failure conditions in other parts of the architecture which can make it useful, it would be turned off. We definitely don’t top out at only 35-40 million queries per day. More like ten times that and it would go higher with more drives per server. The query cache is great when it works, just doesn’t for this application.

    When trying MySQL 5.0, you’ll get better InnoDB insert performance if you use the InnoDB option innodb_support_xa=0 to disable support for XA distributed transactions and the extra fsyncs it requires. Support for them is on by default. The compact row format in 5.0 can make a big database size difference, particularly if you have many small fields.

    How you were preparing and loading your database dumps? Them being comparatively slow is interesting.

    You might also find MogileFS at http://www.livejournal.com/users/brad/2177644.html interesting, since it’s what LJ uses for a distributed image storage system.

    December 3rd, 2005 at 1:54 am
  24. Alex, I’d like permission to quote you for the PostgreSQL.org website. Please e-mail me ( josh — at — postgresql . org ), or comment below, if this is OK.

    December 3rd, 2005 at 3:47 pm
  25. James,

    Thanks for the candid response. This is absolutely the best blog comment conversation I have ever had, and I am glad it is not reducing to a flamewar :)

    I will send you the list of items in email, but the system will have not been under load (in fact mysqld is off right now). Short and sweet, MySQL was running on a dual Opteron (speed doesn’t matter, as the CPU usage was minimal) with 8 GB RAM, and I had the innodb_buffer_pool size set around 6GB. The disk array was 6 SATA 7200RPM 16MB hard drives in RAID 5 on a dedicated controller with 128MB cache.

    GUIDs are interesting, and we are using the worst-case scenario, user-readable GUIDs (the equivalent of CHAR(37)), generated either using the MySQL UUID() function, or some python code, located at http://www.cavebear.com/cblib/ka_guidgen.py. The discussion on livejournal is very interesting, and I chose GUIDs for the distributed generation reason. The intention of the entire FeedLounge architecture is to be able to distribute everything, including data in the database. All queries are setup such that they would only need to hit some subset of the data. I understood that I would be forcing data out of order, and I may have ways to mitigate that, though not as much as that discussion talks about.

    I knew I was walking into problems when converting from MyISAM to InnoDB, but I had no choice at the time. The table locking behavior in MyISAM in a 50% write environment is a deal-killer. I couldn’t scale past 10 users or so.

    I guess the conclusion is, that given the design/architecture choices that I made, InnoDB is not the optimal choice for my application. It doesn’t look like it will work without a significant amount of refactoring on the application side. I don’t think there is any configuration magic pill that will fix it.

    As far as FeedLounge goes, in the 4+ million queries/day that were happening, over 2 million were updates/inserts, and the updates/inserts were peaking in the 2000/second range (there is quite a bit of idle time where nothing happens). That seems to be 6 times what the Wikipedia boxes are doing. I knew the IO was my bottleneck, and you have just confirmed that the UUID choice in primary keys are forcing the index size to increase, which just increases the IO load. Even if all this was fixed, it still seems that InnoDB will be twice as large as Postgres, no?

    I am not so worried about the database dumps since I only have to do them very infrequently, I was just giving an idea of what I was seeing.

    I have looked into MogileFS, and will probably start using memcached first to offload as many reads as possible, and then perhaps move some things out to MogileFS. Right now, all feed item content is stored in a TEXT column.

    December 3rd, 2005 at 7:22 pm
  26. Peter Zaitsev adds this Comment:

    UUID is really poor choice for primary key for Innodb.

    The data is going to be clustered by it… and it is random which means you will insert in radom spots in giant BTREE (as it holds rows). Furthermore inserts in the middle will frequently result in page spits which causes IO and fragmentation.

    Now Innob uses PRIMARY KEY to reference all the rows, so
    if you have secondary key which is ie INT it will be really taking
    50 bytes or so because UUID key is stored with it.

    At least I’d try to use artificial auto_increment column as a primary key promoting UUID key to unique - which will effectively make table non clustered.

    In general I would not recommend to use UUID if you care about performance unless absolutely nescesary.

    If you do at least pack them to be char(16) binary which makes it more compact.

    Well… some day I guess we should have native UUID type in MySQL.

    December 3rd, 2005 at 8:47 pm
  27. What James is trying to say above is that if you have MySQL employees on your staff supporting your site that MySQL will work great.

    The good news is that by getting your site working on PostgreSQL you’ve already made the important first step required to getting MySQL AB to hire some of your volunteer developers. ;)

    December 4th, 2005 at 11:31 am
  28. Gregory,

    I don’t think James is trying to say that at all, he is just trying to say that MySQL doesn’t handle large primary keys inserted in non-natural order very well.

    That is, unless I totally missed something in the above discussion.

    December 4th, 2005 at 12:20 pm
  29. Well, I was being something of a smart ass, it’s fun sport around Wikipedia to answer questions about why we’re on MySQL by pointing out that MySQL AB has hired the person working on porting mediawiki to PgSQL. :)

    There was a bit more to my statement though, because it is the case that you’ll get better performance with a bunch of MySQL folks working on your site simply because MySQL requires more design concessions from the application than other modern databases, resulting in a design which is heavily married to MySQL.

    I struggle with MySQL all the time while working with Wikipedia… I have more expirence with Oracle and PostgreSQL, and I’m working on doing fancy reports, adhoc queries, etc against the Wikipedia databases and MySQL’s perpetual stupidity drives me nuts. It isn’t uncommon for my local backdated copy of Wikipeia in PostgreSQL to out perform MySQL by 1+ orders of magnatude on such queries. If we had not been able to put MySQL 5 on the development box I use at Wikipedia and get it replicating (which is pretty cool, btw, I give kudos to MySQL’s replication, it works pretty well), then I probably would have just given up entirely. Unfortunatly, MySQL 5 still leaves a lot to be desired with respect to performance on non-trivial queries vs PG, and isn’t yet stable in our enviroment (replicating from MySQL 4).

    December 4th, 2005 at 4:39 pm
  30. Gregory,

    Replication was a big reason to stay with MySQL, but ultimately the database MUST be partitioned to multiple machines anyway, so I plan on sprinkling in some MogileFS type functionality at the API DB interface level, when I find another round tuit.

    December 4th, 2005 at 4:49 pm
  31. Feedlounge switched from MySQL to PostgreSQL for good

    Feedlounge, the upcoming web-based feeds reader on steroids mentioned a while back that they switched their db-server from MySQL to PostgreSQL and by doing so, they saved alot of space and reduced their restore times dramatically.

    Some of the reason…

    December 5th, 2005 at 9:10 am
  32. Well it’s certainly been my thinking that wikipedia moving to postgresql was long overdue, for many of the same reasons mentioned here (less disk space, faster queries, faster load times) but also for some reason not mentioned here, like being able to use full-text searching and foriegn keys at the same time.

    BTW Gregory, it’s my understanding that someone has taken the wikipgedia code and gotten Slony replication going with it, so wikipedia+postgresql+replication is also a possibility.

    Now if only someone would hire me to work on the project ;-)

    December 7th, 2005 at 9:23 am
  33. Alex,

    what I have heard during the past few years, InnoDB tables usually take slightly less space than PostgreSQL tables.

    InnoDB stores the rows in a table in a B-tree, in the order of the PRIMARY KEY. In Oracle, this type of table storage is called an ‘index-organized table’, in Sybase and InnoDB, a ‘clustered index’. I think PostgreSQL does not support clustered indexes.

    In your case, like Peter Zaitsev said, if you are using a 37-byte UUID column as the primary key, that will cause a lot of extra space usage in InnoDB because each secondary index record must contain the primary key so that we can find the row once we have the secondary index record.

    You should probably use an artificially created unsigned INT or BIGINT as your primary key. That would save a lot of space in InnoDB. If you do not create a PRIMARY KEY and do not create any UNIQUE INDEX to the table, then InnoDB internally creates a 6-byte row id column to act as the clustered index key.

    The fastest way to load a table dump to InnoDB is usually to load the rows in the order of the PRIMARY KEY.

    You may want to use the my.cnf option:

    innodb_file_per_table

    because then each table is stored into its own .ibd file, and OPTIMIZE TABLE will free the extra disk space for the operating system to use. If you are using the traditional InnoDB storage model, then all tables are in ibdata files, and ibdata files never shrink.

    Also make sure that you do not have dangling transactions in the database for hours or days (note that also a SELECT query starts a transaction). Dangling transactions prevent InnoDB’s automatic purge from removing old versions of rows, and a heavily updated table can grow quite a bit because of this.

    Best regards,

    Heikki Tuuri
    Oracle/Innobase

    December 8th, 2005 at 8:18 am
  34. Heikki,

    Thanks for the response. The issue with using a GUID for the primary key allows n app servers creating records without involving the database until absolutely necessary.

    The reason they are readable chars is to allow humans to be able to work on the database with the help of some other program. I know there are tradeoffs in data storage and performance.

    The intention also is that there will ultimately be m databases, so while I could use a INT primary key, all the indices would still need to use the GUID.

    I am not concerned with InnoDB files never shrinking, just the difference in size. FeedLounge tables don’t typically shrink.

    As far as transactions go, FeedLounge auto-commits everything, as there is no need for transactionality in our application, by design.

    If I had to do it all over again, knowing what I know now, I would still do the same thing, but start with 2 databases from the beginning, baking it into the API. RSS apps are so data hungry that you won’t be able to keep it all in one database, unless you have a budget Oracle would be happy to help you with :)

    December 8th, 2005 at 10:55 am
  35. Scott,

    but you can create your table like this:

    CREAT TABLE t(
    a INT NOT NULL AUTO_INCREMENT,
    GUID CHAR(37),

    UNIQUE INDEX (GUID)) TYPE = InnoDB;

    The auto-inc column is just for the internal use of the database node. It does not need to have any globally unique value.

    This results in a storage structure equivalent to PostgreSQL and Oracle (not an index-organized table). The format of an index record for InnoDB is then:

    (column_value, a)

    where a can be seen as the ‘pointer’ to the heap of rows in the data file. In PostgreSQL, the index record is:

    (column_value, )

    In this way, with clustered indexes you CAN store the rows in a heap, just like in a non-clustered database. That is the reason why I never made an explicit heap storage structure to InnoDB.

    You do not need GUID stored physically to your secondary index records. If you do a query like this:

    SELECT * FROM … WHERE GUID = ‘342342′ AND …

    the SQL optimizer will naturally use the unique index on GUID to access the row.

    About InnoDB tuning: making the buffer pool and ib_logfiles big as recommended at
    http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html
    is important. If you are committing after each single row insert or update, then you may consider using innodb_flush_log_at_trx_commit=2 if you can sacrifice the last second of transactions in an operating system crash.

    Best regards,

    Heikki
    Oracle/Innobase

    December 8th, 2005 at 11:34 am
  36. Just as an FYI, PostgreSQL does do clustered indexes. The implementation seems to be different though. There is more info in the PostgreSQL docs for cluster:

    http://www.postgresql.org/docs/8.1/interactive/sql-cluster.html

    Or for a whole database, clusterdb:

    http://www.postgresql.org/docs/8.1/interactive/app-clusterdb.html

    December 8th, 2005 at 12:26 pm
  37. Wow, blogging has a point. Great article.
    -l

    December 15th, 2005 at 9:58 am
  38. For reference, FeedDigest is doing 4,150,000 queries a day, 2.3 million of which are INSERTs. This is on a dual Pentium 4 2.8GHz box with a gig of RAM. We’d like to have more RAM but our provider claims we can only go up to 1.5GB and that seems a bit pointless.. so we’ll consider clustering in future.

    But for now, it runs plenty fast enough. We had some problems, but learning MySQL, how it operates, and how to write the SQL well for it resolved them all, and now it hovers at a load of around 1.0, which is pretty good for a dual CPU box. Most of the load is iowait, so I’m sure we can optimize it to scale up to tens of millions of queries.. and I already know we’re doing some pointless/needless queries that could be optimized down.

    December 15th, 2005 at 12:57 pm
  39. Would be interested in knowing what type of tables your using Peter.

    December 15th, 2005 at 1:29 pm
  40. MyISAM.

    Considered InnoDB for a while as I thought I needed something it offered (row level locking) but it turns out I was approaching the problem wrong. We have no need for transactions either, and MyISAM is reasonable with high INSERT/high SELECT tables if you optimize the table regularly or have no DELETEs.

    December 15th, 2005 at 5:36 pm
  41. Peter,

    DELETEs are what killed us, yes. If you never do a delete, you should be OK.

    December 15th, 2005 at 6:02 pm
  42. Great reading here because I was almost about to commit to using the innodb engine. The downside is now I have trawl through the sources for postgresql.

    February 28th, 2006 at 9:00 pm
  43. Eloy adds this Comment:

    Hello all,

    We are currently evaluating the InnoDB engine under MySQL 5.0.18 for a 1 Terabyte insert-mostly occasional-delete table, using compact rows.

    Unfortunately we continously run into a problem crashing the new server, which is displayed in our /var/log/mysql/mysqld.err log:

    060313 09:33:47 mysqld started
    060313 9:33:48 InnoDB: Started; log sequence number 183 1523173388
    060313 9:33:48 [Note] /usr/sbin/mysqld-500: ready for connections.
    Version: ‘5.0.18-log’ socket: ‘/var/run/mysqld/mysqld.sock’ port: 3306 Gentoo Linux mysql-5.0.18-r30
    InnoDB: Database page corruption on disk or a failed
    InnoDB: file read of page 264004.
    InnoDB: You may have to recover from a backup.
    060313 9:55:22 InnoDB: Page dump in ascii and hex (16384 bytes):

    *** dump data omitted ***

    060313 9:55:22 InnoDB: Page checksum 3271725028, prior-to-4.0.14-form checksum 625759328
    InnoDB: stored checksum 1151023933, prior-to-4.0.14-form stored checksum 625759328
    InnoDB: Page lsn 2 2466360550, low 4 bytes of lsn at page end 2466360550
    InnoDB: Page number (if stored to page already) 264004,
    InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
    InnoDB: Page may be an index page where index id is 0 25
    InnoDB: (index GEN_CLUST_INDEX of table history/quotes)
    InnoDB: Database page corruption on disk or a failed
    InnoDB: file read of page 264004.
    InnoDB: You may have to recover from a backup.
    InnoDB: It is also possible that your operating
    InnoDB: system has corrupted its own file cache
    InnoDB: and rebooting your computer removes the
    InnoDB: error.
    InnoDB: If the corrupt page is an index page
    InnoDB: you can also try to fix the corruption
    InnoDB: by dumping, dropping, and reimporting
    InnoDB: the corrupt table. You can use CHECK
    InnoDB: TABLE to scan your table for corruption.
    InnoDB: See also http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
    InnoDB: about forcing recovery.
    InnoDB: Ending processing because of a corrupt database page.

    After which recovery starts.

    The server’s hardware seems to operate fine, i.e. no errors in any log whatsoever. The error can happen with different queries, but can be triggered by larger SELECTs, such as you would use to copy the table, or a CHECK TABLE.

    Odd enough, we have an older database running MySQL 4.0.18 with the same big table, and we never had a problem with InnoDB. Has anyone had similar problems?

    March 13th, 2006 at 9:17 am
  44. Stefan adds this Comment:

    I think I would suggest that you recreate the InnoDB tablespace with the 5.0.18 version.

    And import the data in 5.0.
    I know that is probably not that easy with 1TB of data.
    But as I read the output you posted you are running with a InnoDB tablespace created by an old 4.0.x version of MySQL.

    March 21st, 2006 at 7:08 am
  45. Eloy adds this Comment:

    The error is actually generated by InnoDB under MySQL 5.0.18. Preliminary testing on dissimilar hardware has shown that this error seems to be triggered by using a broadcom scsi raid controller in combination with InnoDB.

    And yes, it’s a pain to rewrite 1+ TB of data, especially when the tables are not exactly the same, so you can’t use mysqldump/mysqlimport ^^.

    March 27th, 2006 at 11:54 pm
  46. Fellow developer/entrepreneur adds this Comment:

    Hello Scott and Alex,

    First, I’d like to congratulate you guys on all your accomplishments to date on FL. Also, thanks for sharing information via the blog on the challenges you guys have faced and addressed while trying to develop and run your app/business.

    On that note, I am really curious to hear what the current state of your system performance is. I’m really interested in hearing whether you guys have stayed with PostgreSQL, what kind of configuration you are running to support your traffic, or any other bit of info you can provide on this topic.

    I’m a developer pursuing the entrepreneurial dream by developing a web app and am very interested in learning how popular db-backed websites are designed/configured to perform under their heavy load.

    Thanks.

    May 17th, 2006 at 3:01 pm
  47. Our system performance is quite good now. We’ve still using PostgreSQL for a lot of data, however we’ve moved to different solutions (memcache, etc.) in various places as well.

    May 20th, 2006 at 10:56 am
  48. zoran adds this Comment:

    Hey Guys,

    Great series of posts. I’m a big PostgreSQL fan, mainly for it’s advanced features. It’s amazing how much more sophisticated your applications can become and you begin to wonder how you ever did without them.

    I generally thought of it as being slower than MySQL for basic stuff but better under higher loads and more complex queries. It’s good to see you are seeing some performace advantages with it.

    My question is for those of you who have had some experience with the big boys (Oracle, DB2 and even Sybase) and am wondering how they perform compared to Postgres. I know it’s a difficult question, but I’m trying to work out at what point I might consider swapping over, or if in deed whether or not you think it’s necessary.

    Each of the vendors say they theirs is the best and can pull benchmarks to prove it out from somewhere. So it would be interesting if any of you could shed some light?

    ps. do any of you have any experience using Postgres for analytical processing / ad hoc queries, is it dumb to even consider it?

    Thanks,

    May 23rd, 2006 at 3:00 am
  49. Unique Custom Design Resources

    I couldn’t understand some parts of this article, but it sounds interesting

    August 16th, 2007 at 2:01 pm
  50. Programming Tutorials

    I couldn’t understand some parts of this article, but it sounds interesting

    August 25th, 2007 at 10:07 am
  51. Great plugin may be…

    August 27th, 2007 at 11:39 am
  52. Business Career Center

    I couldn’t understand some parts of this article, but it sounds interesting

    September 6th, 2007 at 4:59 am
  53. Freeware and Shareware Software Downloads

    I couldn’t understand some parts of this article, but it sounds interesting

    September 9th, 2007 at 4:29 am
  54. Software Development Guide

    I couldn’t understand some parts of this article, but it sounds interesting

    September 23rd, 2007 at 11:20 am
  55. Free Software Downloads

    I couldn’t understand some parts of this article, but it sounds interesting

    September 29th, 2007 at 9:42 pm
  56. Sports Illustrated

    I couldn’t understand some parts of this article, but it sounds interesting

    October 1st, 2007 at 4:57 am
  57. Customized Design Solutions

    I couldn’t understand some parts of this article, but it sounds interesting

    October 8th, 2007 at 1:12 pm
  58. Popular Science

    I couldn’t understand some parts of this article, but it sounds interesting

    October 26th, 2007 at 7:23 pm
  59. nextel ringtones nextel ringtones converter

    However poker online online cash advance

    December 7th, 2007 at 11:16 am
  60. download free ringtones for verizon wireless free ringtones maker download download free ringtones

    Why download free poker superstars instant cash advance online

    December 7th, 2007 at 12:14 pm
  61. free phone ringtones samsung sprint free sprint ringtones free polyphonic ringtones sprint

    Contact cash advance payday loan cingular co.blogspot.com ringtones

    December 7th, 2007 at 12:14 pm
  62. budget line cash advance

    As a summary of cash loan payday till loan payday uk

    December 17th, 2007 at 10:19 am
  63. 1 hour payday loan

    Join 30 day payday loan advance cash settlement

    December 17th, 2007 at 10:20 am
  64. approval instant loan payday

    Get 30 day payday loan payday loan in toronto

    December 17th, 2007 at 12:14 pm
  65. Thanks Nice site

    December 21st, 2007 at 7:51 am
  66. texas holdem starthände

    Near chase credit card payment advance card cash credit

    January 8th, 2008 at 1:51 am
  67. cash advance loan online advance cash loan online

    If cash advance loan online download ericsson free ringtones sony

    January 8th, 2008 at 4:41 am
  68. advance loan payday software? ?cash

    Think cash advance loan online cash advance loan with savings account

    January 8th, 2008 at 4:45 am
  69. advance loan payday software? ?cash

    Dear advance card cash credit cash loan payday till

    January 8th, 2008 at 4:45 am
  70. cash advance until pay day advance cash day pay

    Thus free cingular ringtones usa cash advance until pay day

    January 8th, 2008 at 9:40 am
  71. le supermarché casino en ligne

    As mentioned cash loan payday till la française des jeux

    January 8th, 2008 at 6:22 pm
  72. Essay2review.com has only one principal purpose: That is to help students like you with all your writing problems. Essay

    January 12th, 2008 at 12:21 am
  73. Rapid drug and alcohol detox centers, programs and rehabs for addiction treatment and substance abuse of heroin, opiates, prescription drugs, oxycontin, xanax, methadone and more . Detox

    January 12th, 2008 at 3:44 am
  74. Addiction Search is a comprehensive site dedicated to providing health consumers and professionals with up-to-date, research-based information on drug and alcohol addiction, addiction treatment approaches, and policy issues. Drug Addiction

    January 12th, 2008 at 3:45 am
  75. how to order steroids online

    Identification of inadequate testosterone in an aging male by symptoms alone can be difficult. DHT binds to the same androgen receptor even

    February 18th, 2008 at 2:05 am
  76. Stratumseind Eindhoven

    HOT TOPICS About Across the Sound Between the lines… Branded Entertainment Cause New Marketing Communal Marketing Consumer Central Consumer Generated Content Content is King Creativity Experiential Marketing Fatblogging Fixing the Ad Agency Mess From…

    March 25th, 2008 at 2:14 am
  77. Home Loans for UK Homeowners

    links to good articles on secured loans / homeowner loans

    April 26th, 2008 at 11:43 am
  78. how to sell a home fast

    Texas) the number of residential foreclosures increased 100% in the third quarter of this year. These are very sobering statistics when you think about 446

    May 11th, 2008 at 1:36 pm

Add a Comment or Trackback