Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No fast-track for bulk inserts in Cursor.executemany with INSERT/REPLACE syntax introduced in MySQL 8.0.19 #968

Open
1 task done
hunyadi opened this issue Sep 24, 2023 · 0 comments
Labels

Comments

@hunyadi
Copy link

hunyadi commented Sep 24, 2023

Describe the bug

When calling Cursor.executemany with an INSERT or REPLACE SQL statement, aiomysql compares the statement against a regular expression RE_INSERT_VALUES and if there is a match, a different, fast-track execution path is taken as opposed to expanding executemany into a series of execute statements.

Unfortunately, the regular expression tests against the older INSERT or REPLACE syntax in MySQL versions prior to 8.0.19 only. For example, the following is a match:

INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6)
  ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

This causes a warning message to be emitted with MySQL 8.0.20 and later:

/usr/local/lib/python3.11/site-packages/aiomysql/cursors.py:239: Warning: 'VALUES function' is deprecated and will be removed in a future release. Please use an alias (INSERT INTO ... VALUES (...) AS alias) and replace VALUES(col) in the ON DUPLICATE KEY UPDATE clause with alias.col instead

However, if the new recommended syntax is adopted, the fast-track course is not longer chosen, and execution significantly slows down. This is because the new syntax is no longer a match for RE_INSERT_VALUES:

INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6) AS new(m,n,p)
  ON DUPLICATE KEY UPDATE c = m+n;

On the contrary, execution speed is restored if we slightly modify the regular expression that SQL statements are tested against:

RE_INSERT_VALUES = re.compile(
    r"\s*((?:INSERT|REPLACE)\s.+\sVALUES?\s+)"
    + r"(\(\s*(?:%s|%\(.+\)s)\s*(?:,\s*(?:%s|%\(.+\)s)\s*)*\))"
    + r"(\s*(?:(?:AS|ON DUPLICATE).*)?);?\s*\Z",
    re.IGNORECASE | re.DOTALL,
)

This will make both old-style and new-style syntax pass.

To Reproduce

Try a bulk INSERT statement with the old syntax:

INSERT INTO "DataTable"
("id", "data") VALUES (%s, %s)
ON DUPLICATE KEY UPDATE
"data" = VALUES("data")

A warning message is emitted in MySQL 8.0.20 and later.

Try a bulk INSERT statement with the new syntax (MySQL 8.0.19 and later):

INSERT INTO "DataTable"
("id", "data") VALUES (%s, %s) AS EXCLUDED
ON DUPLICATE KEY UPDATE
"data" = EXCLUDED."data"

Execution significantly slows down.

Expected behavior

Execution speed does not diminish when using the new MySQL 8.0.20 syntax for INSERT.

Logs/tracebacks

n/a

Python Version

Python 3.11.5

aiomysql Version

Name: aiomysql
Version: 0.2.0
Summary: MySQL driver for asyncio.
Home-page: https://github.com/aio-libs/aiomysql
Author: Nikolay Novik
Author-email: [email protected]
License: MIT
Location: /usr/local/lib/python3.11/site-packages
Requires: PyMySQL
Required-by:

PyMySQL Version

Name: PyMySQL
Version: 1.1.0
Summary: Pure Python MySQL Driver
Home-page: 
Author: 
Author-email: Inada Naoki <[email protected]>, Yutaka Matsubara <[email protected]>
License: MIT License
Location: /usr/local/lib/python3.11/site-packages
Requires: 
Required-by: aiomysql

SQLAlchemy Version

No response

OS

macOS 13.5.2

Database type and version

MySQL 8.1.0

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@hunyadi hunyadi added the bug label Sep 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant