Commit graph

48277 commits

Author SHA1 Message Date
Ankush Menat
8db2e1e2a6
fix: Avoid applying pinning optimization on RQ (#28896)
We only need it for gunicorn workers.
2024-12-24 09:41:18 +00:00
Akhil Narang
aa5455e281
Merge pull request #28386 from netchampfaris/invalid-encr-key-message
fix: better error message
2024-12-24 12:50:22 +05:30
Akhil Narang
7af83f6d37
Merge remote-tracking branch 'upstream/develop' into invalid-encr-key-message
* upstream/develop: (1373 commits)
  perf: cache dynamic links map in Redis (#28878)
  fix: Never query `flag_print_sql` in `developer_mode=0` (#28884)
  fix(restore): remove MariaDB view security definers
  fix: sanitize user input during setup wizard
  feat(sanitize_column): improve check
  refactor: make optimizations.py private entirely (#28872)
  fix(site_cache): site cache thread safety (#28870)
  chore(printview): change error message
  perf: speedup `frappe.call` by ~8x (#28866)
  test: reduce noise in test output (#28862)
  chore: spelling_invalid_values (#28858)
  fix: Remove misleading os.O_NONBLOCK flag (#28859)
  fix: string replacement in error logger
  perf(gthread): Pin web workers to a single core (#28854)
  fix: MariaDBDatabase.get_tables() should not query the entire database schema (#28846)
  fix: add strings and fields to translation
  fix: typo in test controller boilerplate
  perf: faster add_to_date (#28843)
  perf(version): Make get_versions fast for autoincrement doctypes (#28847)
  refactor: log in monitor as well
  ...
2024-12-24 12:36:33 +05:30
Akhil Narang
6b9960ca5c
chore: update message
Signed-off-by: Akhil Narang <me@akhilnarang.dev>
2024-12-24 12:35:19 +05:30
Akhil Narang
da5cfd9ad7
Merge pull request #28772 from mahsem/add_Swedish_date_and_time_format
fix: add Swedish date and time format
2024-12-24 12:33:55 +05:30
Akhil Narang
105a3b153d
Merge pull request #28513 from akhilnarang/fix-retry-background-job-after-job-hook
fix(background_jobs): init site if required for after_job hooks
2024-12-24 11:38:32 +05:30
Akhil Narang
c80cb525ea
Merge pull request #28876 from akhilnarang/improve-query-sanitization
feat(sanitize_column): improve check
2024-12-24 11:07:43 +05:30
Akhil Narang
6b8983cb91
Merge pull request #28877 from Sanket322/sanitize_input
fix: sanitize user input during setup wizard
2024-12-24 11:07:05 +05:30
Ankush Menat
3cb8a9e2e4
perf: cache dynamic links map in Redis (#28878)
Note about correctness: Once a site has seen enough usage this map will rarely change, so the
problem of "cache inconsistency" is very rare. Care is still taken to
avoid possible cache inconsistencies.
2024-12-23 19:43:05 +05:30
Ankush Menat
4f628ca091
fix: Never query flag_print_sql in developer_mode=0 (#28884)
Unnecessary overhead, and I need to disable this every time I want to get
realistic performance numbers.

All the performance affecting toggles should be directly controlled by
just `developer_mode` alone.
2024-12-23 13:57:01 +00:00
Akhil Narang
f4407c84c7
Merge pull request #28879 from akhilnarang/strip-view-security-definer
fix(restore): remove MariaDB view security definers
2024-12-23 17:44:01 +05:30
Akhil Narang
098d4896e3
fix(restore): remove MariaDB view security definers
Signed-off-by: Akhil Narang <me@akhilnarang.dev>
2024-12-23 17:23:17 +05:30
Sanket322
b119513dc1
fix: sanitize user input during setup wizard
2024-12-23 16:32:28 +05:30
Akhil Narang
b5bad56cdd
feat(sanitize_column): improve check
Signed-off-by: Akhil Narang <me@akhilnarang.dev>
2024-12-23 16:16:08 +05:30
Akhil Narang
21a6d2a717
Merge pull request #28868 from akhilnarang/printview-cleanup-checks
chore(printview): change error message
2024-12-23 15:27:54 +05:30
Ankush Menat
fe63af5449
refactor: make optimizations.py private entirely (#28872)
Avoids having to prefix everything with `_`.
2024-12-23 09:56:56 +00:00
Ankush Menat
17686eba3b
fix(site_cache): site cache thread safety (#28870)
Identified two cases where site cache can break:

1. Other thread clears cache using clear_cache because of TTL or manual
   eviction.
2. Other thread pops the element we are about to read because of
   `maxsize` limit.

This change should fix both and even make it a little bit faster.
2024-12-23 13:44:19 +05:30
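The two races described in the commit above can be illustrated with a small sketch. This is not the code from the commit; `tiny_cache`, `_MISSING` and `square` are hypothetical names, and the eviction policy (drop the oldest inserted key) is an assumption. The idea shown is to replace a "check, then read" pair with a single lookup, and to make eviction tolerant of another thread having already removed the key.

```python
from functools import wraps

_MISSING = object()  # sentinel, so a cached None is still a hit


def tiny_cache(maxsize=128):
    """Hypothetical per-function cache that tolerates concurrent eviction."""

    def decorator(func):
        store = {}

        @wraps(func)
        def wrapper(key):
            # One lookup instead of `if key in store: return store[key]`,
            # which can raise KeyError if another thread evicts in between.
            value = store.get(key, _MISSING)
            if value is _MISSING:
                value = func(key)
                store[key] = value
                if len(store) > maxsize:
                    try:
                        # Drop the oldest key; pop(..., None) does not fail
                        # if another thread already removed it.
                        store.pop(next(iter(store)), None)
                    except (StopIteration, RuntimeError):
                        pass  # store was cleared or resized concurrently
            return value

        wrapper.clear_cache = store.clear
        return wrapper

    return decorator


calls = []


@tiny_cache(maxsize=2)
def square(x):
    calls.append(x)
    return x * x


results = [square(1), square(1), square(2), square(3), square(1)]
```

Here the second `square(1)` is served from the cache, while the last one is recomputed because key `1` was evicted once the third distinct key arrived.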
Akhil Narang
5a4239fbe3
chore(printview): change error message
Signed-off-by: Akhil Narang <me@akhilnarang.dev>
2024-12-23 13:29:44 +05:30
Ankush Menat
197a49cf27
perf: speedup frappe.call by ~8x (#28866)
Before: 8.81us
After: 1.1us

Benchmarks in caffeine repo
2024-12-23 06:41:20 +00:00
Akhil Narang
076a8fdd1a
Merge pull request #28855 from mahsem/add_strings_and_fields_to_translation
fix: add strings and fields to translation
2024-12-23 11:47:45 +05:30
Ankush Menat
7d4d6b59df
test: reduce noise in test output (#28862)
* chore: remove verbose output from test runner

This is the same output that's shared by the test runner in a different format?

This makes it annoying to scroll through when just running a single test
locally.

* fix: Remove clutter from test output

Test records don't change after first run.
Tests are executed many many times locally

* test: retry flaky postgres backup tests
2024-12-23 06:11:47 +00:00
mahsem
dd8b353caa
chore: spelling_invalid_values (#28858)
2024-12-23 11:02:20 +05:30
Ankush Menat
e85aa44843
fix: Remove misleading os.O_NONBLOCK flag (#28859)
This works like this in C `open(2)` on a file descriptor, not in Python :)

In Python it sets buffering to the enum value, which in this case is
2048; if it were a lower number this would've made performance worse.

ref:
- https://man7.org/linux/man-pages/man2/open.2.html#DESCRIPTION
- https://docs.python.org/3/library/functions.html#open
2024-12-23 05:22:13 +00:00
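A short sketch of the pitfall described in the commit above, assuming a Unix-like system (the `fcntl` module is not available on Windows). It is illustrative only and does not reproduce the changed code: the third positional argument of Python's built-in `open()` is `buffering`, so passing `os.O_NONBLOCK` there only sets a buffer size, whereas `os.open()` is the call that accepts `open(2)`-style flags.

```python
import fcntl
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    path = tmp.name

# The flag's integer value is used as a buffer size here;
# nothing about this file object becomes non-blocking.
with open(path, "rb", os.O_NONBLOCK) as f:
    data = f.read()

# Flags belong to os.open(), which mirrors open(2).
fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
try:
    # Read back the file status flags to see whether O_NONBLOCK is set.
    nonblocking = bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)
finally:
    os.close(fd)

os.unlink(path)
```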
Md Hussain Nagaria
df7c8b1f88
Merge pull request #28853 from frappe/fix-boilerplate-typo
2024-12-23 07:02:57 +05:30
Ankush Menat
ca192fe208
fix: string replacement in error logger
2024-12-22 18:19:32 +05:30
Ankush Menat
d466578348
perf(gthread): Pin web workers to a single core (#28854)
Python's multithreaded model is _inefficient_ because of the Global
Interpreter Lock (GIL). Only one thread of a process can run at any given
time. Thus the only valid use cases for threads in Python are:

1. Hiding I/O latency by switching to a different thread.
2. Using compiled extensions that yield GIL for long enough time to do
   meaningful work in other threads.

Neither of these is as frequent as you'd imagine, and a gthread worker
with multiple threads often just ends up contending on the lock and wasting
useful CPU cycles doing nothing. Pinning the worker process to a core nearly
eliminates this contention wastage. This waste can be 5-10% and goes up
sharply with more threads.

E.g. FC typically has maxed out config of 24 workers which allows
"accepting" and working on 24 requests at a time. But that doesn't mean
24 requests are on CPU at any given time, that would require 24 physical
cores.

Why do this?

1. Context switching in threads is faster than switching process - fewer
   cache misses, fewer TLB misses etc.
2. The model is simple
    True parallelism = count(cores) = count(processes).
    Expected concurrency = count(processes) * count(threads).
3. This is far simpler to reason about than something like async
   executor model.
4. Ability to queue more requests than what can be handled is already
   implemented by `bind(2)` and `accept(2)` in kernel. There is no real
   benefit of accepting 1000 requests if you can only work on 20 of them
   at a time. This is because we do a lot of "work" in requests, it's
   not just issuing an external request and waiting for it.
5. We can achieve practically the same concurrency as 24 workers with 4
   processes x 6 threads. That's a lot of memory saved to run other useful
   things.

Caveats:
- This kind of pinning can potentially make Linux scheduler inefficient.
  I don't quite think it's going to be a big problem because there are
  plenty of other things to run which a core can steal from other core
  if it doesn't have enough work.
- Load balancing in single-server multi-bench setup. I *think* by nature
  of how `accept(2)` works, load balancing will still happen pretty much
  automatically. If certain core is overloaded, naturally other cores
  will reach `accept(2)` more frequently and take the load off of that
  core. This is something worth validating in practice by creating
  skewed affinities.
- This code is not NUMA-aware. None of our machines have NUMA nodes, so
  I am ignoring it. Don't use it if you have a NUMA setup.
- If new CPUs are hotplugged or existing ones are disabled then it can
  be inefficient (worse than current) until that worker auto-restarts (which
  happens after N requests in FC setup).

Ideal solution: We write userspace scheduler to implement
"soft-affinity" using Linux's new eBPF based sched_ext feature. That's
too much extra work but I'll consider this too at some point.

closes https://github.com/frappe/caffeine/issues/13
2024-12-22 16:59:13 +05:30
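A minimal sketch of the pinning idea from the commit above, assuming Linux (`os.sched_setaffinity` is not available on every platform). The helper names `pick_core` and `pin_current_process` and the round-robin assignment are assumptions for illustration; the commit's actual core-selection logic is not shown in this log.

```python
import os
from typing import Optional, Sequence


def pick_core(worker_index: int, cores: Sequence[int]) -> int:
    """Round-robin: map the n-th worker onto one of the allowed cores."""
    return cores[worker_index % len(cores)]


def pin_current_process(worker_index: int) -> Optional[int]:
    """Pin the calling process to one core; return it, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):  # Linux-only API
        return None
    cores = sorted(os.sched_getaffinity(0))  # cores this process may use
    core = pick_core(worker_index, cores)
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return core


# Sizing rule from the commit message:
#   true parallelism     = count(processes), at most count(cores)
#   expected concurrency = count(processes) * count(threads)
processes, threads = 4, 6
expected_concurrency = processes * threads

chosen = [pick_core(i, [0, 1, 2, 3]) for i in range(6)]
```

In a real deployment something like `pin_current_process` would presumably be called from a hook that runs in each worker after fork, so that every worker pins itself rather than the parent process.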
Brian Pond
61a16f399e
fix: MariaDBDatabase.get_tables() should not query the entire database schema (#28846)
2024-12-22 13:45:23 +05:30
mahsem
e8698a98de
fix: add strings and fields to translation
2024-12-21 13:17:01 +01:00
Hussain Nagaria
673269fdcf
fix: typo in test controller boilerplate
2024-12-21 13:13:22 +05:30
Ankush Menat
f243aa1942
perf: faster add_to_date (#28843)
* fix(DX): Accept None directly as input to add_to_date

It's already supported, signature is just not updated.

* perf: use efficient date parser
2024-12-21 05:23:34 +00:00
Balamurali M
773bf810af
perf(version): Make get_versions fast for autoincrement doctypes (#28847)
Since docname is varchar, indexes won't work when an int is passed as the
value.
2024-12-21 10:41:32 +05:30
Akhil Narang
bbf1abf3d3
Merge pull request #28845 from ruthra-kumar/log_memory_usage
feat: log peak memory usage for Prepared reports
2024-12-20 16:24:18 +05:30
ruthra kumar
63a6c8c903
refactor: log in monitor as well
2024-12-20 11:35:41 +05:30
ruthra kumar
29cf8cef8f
feat: log peak memory usage for Prepared reports
2024-12-20 10:18:27 +05:30
Ankush Menat
9e9096834f
perf: pretty_date - avoid useless dt->string->dt cycle (#28842)
2024-12-19 14:53:38 +00:00
Ankush Menat
17cc356915
perf: speed up flt by 1.06x and get_system_settings by 1.32x (#28841)
* perf: resolve rounding method once

When the rounding method is explicitly specified it's 1.4x faster.

* perf: reorder checks

Bankers rounding is default and most common now

* perf: speedup get_system_settings
2024-12-19 14:38:45 +00:00
Ankush Menat
1c2f8abb4e
perf: speedup get_datetime by ~9.5x (#28840)
2024-12-19 20:01:45 +05:30
Ankush Menat
9419344c76
fix: always print tracebacks (#28838)
* fix: fallback for always printing tracebacks

I don't recall ever hitting "no" to this prompt. It's of no use for me.

Also, this makes automated scripts not really automated.

* revert: prompting for exceptions

Always print full exception
2024-12-19 12:46:11 +00:00
Ankush Menat
9e8ab92371
refactor: move all optimizations and pre/post fork hooks to separate file (#28832)
Now they will truly execute before/after fork = :pinch: few bytes saved!
2024-12-19 16:46:26 +05:30
Akhil Narang
a560ba27e4
Merge pull request #28820 from rutwikhdev/fix-workflow-builder
fix: find workflow by id instead of state name
2024-12-19 16:45:10 +05:30
Akhil Narang
495d21240e
Merge pull request #28834 from akhilnarang/fix-symlink-different-filesystem
fix(build): `os.replace` -> `shutil.move`
2024-12-19 16:39:47 +05:30
Akhil Narang
6c32b79766
fix(build): os.replace -> shutil.move
In some cases, while running in docker, we end up with:

```
[Errno 18] Invalid cross-device link: 'tmp<hash>' -> './assets/frappe'
```

Using `shutil.move` fixes this as it supports moving across different filesystems; `os.replace` doesn't.

Signed-off-by: Akhil Narang <me@akhilnarang.dev>
2024-12-19 16:12:48 +05:30
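The difference between the two calls in the commit above can be sketched as follows. The commit simply swaps `os.replace` for `shutil.move`; the explicit `EXDEV` fallback below is a variant written out only to show where "Invalid cross-device link" comes from, and `replace_or_move` is a hypothetical helper name.

```python
import errno
import os
import shutil
import tempfile


def replace_or_move(src: str, dst: str) -> None:
    """Rename when possible, fall back to copy-and-delete across filesystems."""
    try:
        os.replace(src, dst)  # rename; only works within one filesystem
    except OSError as exc:
        if exc.errno != errno.EXDEV:  # EXDEV is "Invalid cross-device link"
            raise
        shutil.move(src, dst)  # copies, then removes the source


workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "build.tmp")
dst = os.path.join(workdir, "assets.out")
with open(src, "w") as fh:
    fh.write("bundle")

replace_or_move(src, dst)
moved = os.path.exists(dst) and not os.path.exists(src)
shutil.rmtree(workdir)
```

`shutil.move` already tries a plain rename first and only copies when that fails, so calling it directly, as the commit does, gives the same behavior with less code.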
Ankush Menat
ae434dabfe
Merge pull request #28825 from frappe/fix-build
fix: esbuild with cached flag
2024-12-18 19:32:11 +05:30
Ankush Menat
de4e037246
Merge pull request #28827 from ankush/default_document_cache_expiry
fix: default document cache expiry + monitor persistence
2024-12-18 19:31:02 +05:30
Ankush Menat
8df9d3acdd
fix: protect monitor logs from cache eviction
2024-12-18 19:13:02 +05:30
Hussain Nagaria
e980de2788
fix(esbuild): bug that caused apps to json to not get updated when --using-cached
2024-12-18 18:16:47 +05:30
Ankush Menat
6040145109
fix: Set some expiry for cached documents
IMO 1 cache miss per document is fine. This at least ensures that a
missed invalidation won't cause a perpetual problem.
2024-12-18 17:26:50 +05:30
Ankush Menat
004990e53e
perf: Make frappe._dict great again (#28824)
* perf: Restore dict's flat overrides

Using `super()` is unnecessary cost. This class is used A LOT. Ref: https://github.com/frappe/frappe/pull/16449/

Please consider performance while adding types, it's almost always possible to achieve good typing without this.

Also `frappe._dict` is almost always used as `dict[Any, Any]` or
`dict[str, Any]`, type annotations are useless here.

* ci: ugh wait for processes to exit
2024-12-18 16:36:31 +05:30
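To illustrate what "flat overrides" means in the commit above: attribute access on a dict subclass can be bound directly to the built-in `dict` methods at class-definition time, so no Python-level wrapper or `super()` call runs per access. The class below is a hypothetical stand-in named `AttrDict`, not the real `frappe._dict`.

```python
class AttrDict(dict):
    """Hypothetical attribute-access dict using direct method bindings."""

    __slots__ = ()  # no per-instance __dict__, attributes live in the mapping

    # Reuse the C-implemented dict methods directly. Each access then costs
    # one C call instead of a Python function that calls super().
    __getattr__ = dict.get  # missing attributes resolve to None
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__


d = AttrDict(a=1)
d.b = 2  # stored as d["b"]
missing = d.not_there  # None, because dict.get returns None by default
del d.a  # removes key "a"
```

One trade-off of this style is that missing attributes silently evaluate to `None` rather than raising `AttributeError`.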
Ankush Menat
7dd15e3613
perf: speedup pickling of document objects (#28823)
* perf: Use latest pickle protocol

* perf: pop flags from cached documents

This is also the right thing to do, things like `doc.flags.for_update`
shouldn't be "cached".
2024-12-18 10:18:04 +00:00
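The two changes in the commit above can be sketched with the standard `pickle` module. `Doc` is a hypothetical stand-in for a document object and `dump_for_cache` is a hypothetical helper; how the real document class implements this is not shown in this log.

```python
import pickle


class Doc:
    """Hypothetical document object with transient per-request flags."""

    def __init__(self, name):
        self.name = name
        self.flags = {}

    def __getstate__(self):
        # Drop transient flags (e.g. for_update) so they are never cached.
        state = self.__dict__.copy()
        state.pop("flags", None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.flags = {}  # restored objects start with fresh flags


def dump_for_cache(obj) -> bytes:
    # The newest protocol is generally more compact and faster to load.
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


doc = Doc("TASK-0001")
doc.flags["for_update"] = True
state = doc.__getstate__()
payload = dump_for_cache(state)
restored_state = pickle.loads(payload)
```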
Ankush Menat
9d9193800b
fix: Keep HTTP caches private by default (#28719)
Developers can easily enable `can_cache` without knowing what it
entails. Public cache means proxy can likely cache things without
talking to backend.

Obviously many endpoints which can be cached on client side should
probably not be cached in proxy.

E.g. a PR linked to the PR that added this feature suggests caching the
notification log for a short time... we don't want to leak one user's
cached notification to another user.

I don't buy that developers should know about cache implementation to
ensure it's secure or correct to enable it on a certain endpoint. In
addition to that, we have very few mechanisms to bust the cache
inside the proxy. An end user hitting ctrl+shift+r won't do anything if the proxy
wants to serve a stale response.

We should figure out better way to instruct FW about final cache
control headers than hardcoding it IMO.
2024-12-18 14:57:51 +05:30
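The distinction the commit above relies on is the standard HTTP `Cache-Control` scope: `private` allows only the end user's own cache to store a response, while `public` also allows shared caches such as a proxy. A minimal sketch with a hypothetical helper (`cache_control` and its `shared` parameter are illustrative names, not the framework's API):

```python
def cache_control(max_age: int, shared: bool = False) -> str:
    """Build a Cache-Control value that is private unless explicitly shared."""
    scope = "public" if shared else "private"
    return f"{scope}, max-age={max_age}"


per_user = cache_control(60)  # safe default for user-specific data
static_asset = cache_control(3600, shared=True)  # opt in for shared content
```

Defaulting to `private` means a developer who enables caching on a user-specific endpoint cannot accidentally let a proxy serve one user's response to another.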