Writing by corra on Tuesday, 27 of November , 2007 at 11:45 pm
Stasera sono tornato eccezionalmente ad una mansione che mi era propria fino al lontano 1998/99. Ho fatto il web editor, addirittura per il sito inglese, per l’assenza forzata dell’ottimo Stewart che č in volo verso l’estremo oriente.
Non sorprendetevi se la cronaca della partita č risultata pių scarna del solito; io con un po’ di copia-incolla ho cercato di fare del mio meglio.
Tutto sommato č stato divertente.
Writing by corra on Tuesday, 27 of November , 2007 at 3:09 pm
The optimizer is the part of the mysql server whose responsibility is to discover the optimal plan (the execution oath) to solve an SQL query.
When dealing with joins there exist various execution paths depending on the number of tables involved. The number of possible plans to investigates grows exponentially with the number of tables. This is not a problem for queries with few tables (less than 10) but could be a serious problem when the number of tables climbs.
When dealing with dozens of tables the time spent to evaluate all possible plans to solve the query is longer than the execution of the query itself. Not good.
Starting from version 5.0.1 MySQL introduces a flexible method to avoid the possible bottleneck when you have lot of tables joined in a query.
Using the system variable optimizer_prune_level (possible values are 0 or 1) you can tell the optimizer to skip certain plans from being inspected based on the evaluated number of rows involved for every table. The algorithm is heuristic. The default value is 1 (enabled) and this means that a limited number of execution plans are evaluated instead of all possibile combination. This leads to a significant increase of performance when dealing with multi-table join queries.
Setting optimizer_prune_level=0 leads to a much longer query compilation. This is not a problem with few tables but don’t use it if not.
Another useful system variable is optimizer_search_depth. It tells the optimizer how far in the future to expand an incomplete plan. Also in this case the optimizer is able to compile the query faster. As the value of this variable is close to the number of tables involved in a join, the compile time is much longer.
Setting this variable to zero means that the server will determine tha optimal value for every query.
As in the previous case, this variable has negligible effect with few table joins and usually the value set to zero is the best solution.
Surely, using these system variables leads to better performance, but could be possibile that the optimizer fails to find to optimal solution. If you think that the optimizer is failing you can help it forcing or avoiding the use of certain index (FORCE INDEX, IGNORE INDEX) or maybe forcing an order to read the tables (STRAIGHT_JOIN). This are issues left to you and the knowledge of your tables and indexes.
Writing by corra on Wednesday, 21 of November , 2007 at 1:10 pm
When dealing with multicolumn indexes the order of the columns in the index definition is very important. Fpr example if you have such an index
INDEX (a,b,c)
and you issue a query such
... WHERE b>1000;
MySQL will not be able to use the index. In case ol multicolumn indexes MySQL will always use the left-most part of the index. So, the index will be used for example if the query would be:
... WHERE a>100 AND b>1000;
So, MySQL will use the index only for search conditions that involve the follwing columns:
The mulitcolumn indexes are also useful in solving ORDER BY and GROUP BY operations.
Let’s see the folliwing example.
Here is my customer table (for sake of simplicity it’s just a subset of my real table).
corra@localhost> show create table customer\G
*************************** 1. row ***************************
Table: customer
Create Table: CREATE TABLE `customer` (
`idcustomer` mediumint(8) unsigned NOT NULL default '0',
`name` varchar(40) default NULL,
`surname` varchar(40) default NULL,
`email` varchar(60) default NULL,
`city` varchar(50) default NULL,
PRIMARY KEY (`idcustomer`),
KEY `name` (`name`,`surname`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
the table has 350000 rows and, as you can see, there’s a multicolumn index `name`,`surname`.
Let’s have a look to the EXPLAIN of a query selecting alla the customer whose name starts with A with an ORDER BY clause on the columns involved in the multicolumn index.
corra@localhost> explain select * from customer where name like 'a%' order by name, surname\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: range
possible_keys: name
key: name
key_len: 43
ref: NULL
rows: 41971
Extra: Using where
Take a look to Extra field, and now try the query inverting the columns in the ORDER BY clause:
corra@localhost[archivio]> explain select * from customer where name like ‘a%’ order by surname, name\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: range
possible_keys: name
key: name
key_len: 43
ref: NULL
rows: 41971
Extra: Using where; Using filesort
The difference is that now MySQL have to do an extra work to sort the results; the only difference in the EXPLAIN output is the Using filesort value in the Extra field. MySQL can’t use the index to sort the results because the order of the columns is not the same as the index definition. The query is slower then the previous one and the problem will be more serious increasing the table size.
Remember this issues when dealing with multicolumn indexes.
Writing by corra on Tuesday, 20 of November , 2007 at 12:12 am
Today I answered to a problem regarding fulltext indexes on an italian newsgroup. The guy was in trouble in building a multicolumn fulltext index. MySQL always said that a column cannot be part of the index. Why?
Remember that all the columns in a fulltext index must have the same charset and the same collation.
Let’s try if it is true!
Create a sample news table and try to build the fulltext index on (news_title, news_text):
mysql> CREATE TABLE news(
-> id INT auto_increment,
-> news_title VARCHAR(100),
-> news_text TEXT,
-> PRIMARY KEY(id))
-> CHARSET latin1 COLLATE latin1_general_ci;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE FULLTEXT INDEX ft_idx ON news(news_title,news_text);
Query OK, 0 rows affected (0.05 sec)
Records: 0 Duplicates: 0 Warnings: 0
Great, it works!
And now, change the charset of one of the field and try again:
mysql> ALTER TABLE news DROP INDEX ft_idx;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE news MODIFY COLUMN news_title VARCHAR(100) CHARSET utf8;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> CREATE FULLTEXT INDEX ft_idx ON news(news_title,news_text);
ERROR 1283 (HY000): Column 'news_text' cannot be part of FULLTEXT index
And now, try to change the collation of one of fields:
(but first change the charset to the previous one)
mysql> ALTER TABLE news MODIFY COLUMN news_title VARCHAR(100);
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE news MODIFY COLUMN news_text TEXT COLLATE latin1_swedish_ci;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> CREATE FULLTEXT INDEX ft_idx ON news(news_title,news_text);
ERROR 1283 (HY000): Column 'news_text' cannot be part of FULLTEXT index
Same charsets and collations on a multicolumn fulltext index. That’s all.
Writing by corra on Friday, 16 of November , 2007 at 7:38 pm
La settimana č stata bella piena. Ci sono scadenze che si stanno pericolosamente avvicinando e come al solito sto tirando le fila di pių progetti tutti insieme.
Il multitasking č una bella invenzione, per i computer, ma essere tu stesso la risorsa “shared” un po’ fa sbarellare il cervello.
Ovviamente tutti i progetti in corso sono ugualmente importanti e complessi a vari livelli.
Insomma, prevendo delle belle settimane da qui a Natale.