pypika internally keeps copying the query builder object, because everything
in pypika's design is supposed to be immutable. This, however, is terribly
slow: query generation often takes more time than query execution.
This PR makes the query builder mutable inside the `get_query` function to
avoid copying while applying fields, filters, limit, order, etc.
It is marked immutable again before being returned to users of the API.
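The copy-on-chain pattern described above can be sketched as follows. This is a minimal illustration, not pypika's actual code: the class and attribute names (`QueryBuilder`, `immutable`, `select`) are simplified stand-ins for the real builder, which copies itself on every chained call when immutable.

```python
import copy


class QueryBuilder:
    """Minimal sketch of an immutable-by-default builder (illustrative names)."""

    def __init__(self):
        self.immutable = True  # immutable builders copy themselves on each call
        self._fields = []

    def select(self, *fields):
        # Immutable path: copy the whole builder for every chained call.
        # Mutable path: reuse `self`, skipping the copy entirely.
        target = copy.copy(self) if self.immutable else self
        target._fields = [*target._fields, *fields]
        return target


qb = QueryBuilder()
qb.immutable = False              # mutate in place while assembling the query
same = qb.select("name").select("owner")
assert same is qb                 # no intermediate copies were made

qb.immutable = True               # hand an immutable builder back to callers
other = qb.select("creation")
assert other is not qb            # chained calls copy again
```

Flipping the flag only inside query assembly keeps the external contract (immutability) intact while eliminating the per-call copies on the hot path.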
This makes the cache implementation uniform across all methods on the db API.
It was odd that this one method cached in Redis, which defied expectations.
* fix: reduce bulk insert batch size
Back when this feature was added, it lazily evaluated the input.
Now the iterator is consumed upfront, so large batch sizes mean huge memory usage.
* perf: bring back iterator for bulk_insert
Bulk insert used to support iterators, so it could consume and insert
arbitrarily large amounts of data. Since child table support was added, it
can't do that anymore, because child tables require collecting the values.
This change brings iterators back by batching the input iterator (1000
documents per batch by default).
This is almost as good as the original design. Performance is still
mediocre for flat documents.
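The batching approach above can be sketched like this. The helper names (`batched`, `bulk_insert`) and the insert sink are hypothetical, but the core idea matches: each batch is materialised as a list (as child-table handling requires), while peak memory stays bounded by the batch size rather than the full input.

```python
from itertools import islice


def batched(iterable, batch_size=1000):
    """Yield lists of up to `batch_size` items without materialising the input."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def bulk_insert(docs, batch_size=1000):
    # Hypothetical sink: each batch is collected, but only one batch
    # is ever held in memory at a time.
    inserted = 0
    for batch in batched(docs, batch_size):
        inserted += len(batch)  # stand-in for one INSERT statement per batch
    return inserted


assert bulk_insert(range(2500), batch_size=1000) == 2500
assert list(batched("abcde", 2)) == [["a", "b"], ["c", "d"], ["e"]]
```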
* fix: avoid snapshot violation
- The main thread created and "read" a user.
- Another thread modified something.
- The main thread then wants to delete or "write" to the same row.
This violates snapshot isolation.
* fix: treat snapshot violation as deadlock for now
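Treating a snapshot violation as a deadlock means the existing deadlock-retry path handles both. A minimal sketch of that idea, with illustrative exception names and a generic retry wrapper (not the framework's actual classes):

```python
import random
import time


class DeadlockError(Exception):
    """Retryable transaction conflict (illustrative)."""


class SnapshotViolationError(DeadlockError):
    # Subclassing the deadlock error makes snapshot violations flow
    # through the same retry logic "for now", as the commit suggests.
    pass


def run_with_retry(operation, attempts=3):
    """Retry a transactional operation on retryable conflicts."""
    for attempt in range(attempts):
        try:
            return operation()
        except DeadlockError:
            if attempt == attempts - 1:
                raise  # out of retries, surface the conflict
            time.sleep(random.uniform(0, 0.01 * (attempt + 1)))  # small backoff


calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 2:
        raise SnapshotViolationError("row changed under a concurrent writer")
    return "ok"


assert run_with_retry(flaky) == "ok"
assert calls["n"] == 2
```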
* test: handle snapshot violations
(WAL was already enabled in the file itself; no harm setting it here for consistency.)
Reference: 45622c7d54
Co-authored-by: 18alantom <2.alan.tom@gmail.com>
Signed-off-by: Akhil Narang <me@akhilnarang.dev>
* feat: mysqlclient
* fix: update error attrs
* fix: decode mogrified query to unicode
* fix: do some cleanup
* chore: disable cleanup for now
* fix: remove unnecessary call to as_unicode
* test: skip perf test for now
* fix: fallback to empty str
* fix: unbuffered cursor support
* fix: update converters and other changes
* fix: add cleanup back
* perf: improve timedelta converter
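A converter along these lines can be sketched as below. This is an assumption about the shape of the optimisation, not the PR's actual code: MySQL sends TIME values as `HH:MM:SS[.ffffff]` strings, and plain string splits are enough to parse that fixed format without regexes.

```python
import datetime


def timedelta_converter(value):
    """Parse a MySQL TIME string like '12:34:56.789' into a timedelta.

    Sketch of a split-based converter; avoids regex overhead for the
    fixed HH:MM:SS[.ffffff] wire format.
    """
    negative = value.startswith("-")
    if negative:
        value = value[1:]
    hours, minutes, rest = value.split(":")
    if "." in rest:
        seconds, fraction = rest.split(".")
        # Pad the fraction to microsecond precision.
        microseconds = int(fraction.ljust(6, "0"))
    else:
        seconds, microseconds = rest, 0
    delta = datetime.timedelta(
        hours=int(hours),
        minutes=int(minutes),
        seconds=int(seconds),
        microseconds=microseconds,
    )
    return -delta if negative else delta


assert timedelta_converter("1:02:03") == datetime.timedelta(hours=1, minutes=2, seconds=3)
assert timedelta_converter("-0:00:01.5") == -datetime.timedelta(seconds=1, microseconds=500000)
```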
* fix: don't attempt to run query when explain flag is set
* test: cleanup tests
* chore: remove commented code
* perf: store conf as local var
* chore: ensure sequence
---------
Co-authored-by: Ankush Menat <ankush@frappe.io>
* fix: always persist all indexes added via db.add_index
* fix: Add `if not exists` clause for index creation
This allows a replica to have the same index while the master adds it
later, without causing a SQL error. Just a minor DX benefit.
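The idempotent DDL might look like the sketch below. The helper name `add_index_ddl` and the naming convention are hypothetical; the key point is MariaDB's `CREATE INDEX IF NOT EXISTS`, which is a no-op when the index already exists, so replaying the statement is safe.

```python
def add_index_ddl(table, columns, index_name=None):
    """Build idempotent MariaDB index DDL (illustrative helper).

    `IF NOT EXISTS` makes the statement a no-op when the replica
    already has the index, so the master can add it later safely.
    """
    index_name = index_name or "_".join(columns) + "_index"
    cols = ", ".join(f"`{c}`" for c in columns)
    return f"CREATE INDEX IF NOT EXISTS `{index_name}` ON `{table}` ({cols})"


assert (
    add_index_ddl("tabNote", ["owner", "creation"])
    == "CREATE INDEX IF NOT EXISTS `owner_creation_index` ON `tabNote` (`owner`, `creation`)"
)
```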
* fix(postgres): don't cache if table doesn't exist
* chore: revert postgres changes
It's hopeless to maintain this.
Certain tables contain a LOT of duplicate data, so it makes sense to enable
the compressed row format on them by default. I've seen a 5-10 fold reduction
in DB size after enabling the compressed format on a select few tables.
This has some performance overhead:
- both compressed and uncompressed pages live in the buffer pool
- compression/decompression costs CPU
Note:
- These cons don't apply much to the DocTypes I am enabling this for.
- I am not enabling this on existing sites; the migration can take a long
time! Do it manually with the `transform-database` command if you want to.
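For reference, the manual conversion boils down to an `ALTER TABLE` with InnoDB's compressed row format. A sketch of the DDL (the helper function is illustrative; `ROW_FORMAT=COMPRESSED` and `KEY_BLOCK_SIZE` are real InnoDB options, and the statement rebuilds the table, which is why it can take a long time on big tables):

```python
def compress_table_ddl(table, key_block_size=8):
    """DDL to switch an InnoDB table to the compressed row format.

    Illustrative only: ROW_FORMAT=COMPRESSED requires
    innodb_file_per_table and triggers a full table rebuild.
    """
    return (
        f"ALTER TABLE `{table}` "
        f"ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE={key_block_size}"
    )


assert compress_table_ddl("tabVersion") == (
    "ALTER TABLE `tabVersion` ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8"
)
```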