Use of database transactions - best practice

Transactions are used when data consistency matters, for example in accounting and invoicing.
Should transactions also be used for related data such as customer records ([person], [address]) or bills of material ([bill], [bill_item])?

What is best practice for the use of transactions?

best? Don’t do it with Xojo :slight_smile: Okay, seriously: it depends a bit on what you want to do and which database we are talking about.

In my opinion, if you have multiple users working simultaneously, it is crucial to use transactions for writing purposes, particularly when writing to multiple tables. The only exception that comes to mind is when you are storing a significant amount of data (through an API, for example) and you are certain there will be no conflicts. However, even in such cases, I believe it is best to encapsulate commits within transactions, especially when utilizing parallelism, which is likely in the case of an API.
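For example, a multi-table write like the [person]/[address] case from the question could look roughly like this (a minimal sketch in Xojo API 2.0; `db` is assumed to be an already-connected Database, and the column names are made up):

```xojo
Sub SavePersonWithAddress(db As Database, personID As Integer, newName As String, newCity As String)
  Try
    db.BeginTransaction
    db.ExecuteSQL("UPDATE person SET name = ? WHERE id = ?", newName, personID)
    db.ExecuteSQL("UPDATE address SET city = ? WHERE person_id = ?", newCity, personID)
    db.CommitTransaction ' both changes become visible together
  Catch e As DatabaseException
    db.RollbackTransaction ' neither change is written
    Raise e ' let the caller decide how to report the failure
  End Try
End Sub
```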

Most databases will automatically create an implicit transaction for you if you do not set one manually. However, this usually happens per statement, so every single write pays its own commit overhead; overall performance can be noticeably worse if you do not use transactions proactively.
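That per-statement overhead is easy to see with bulk inserts. A sketch (SQLite via Xojo; table and column names are illustrative) with one explicit transaction around the whole loop instead of one implicit transaction per INSERT:

```xojo
Sub InsertManyItems(db As Database, billID As Integer, count As Integer)
  Try
    db.BeginTransaction ' one commit for all rows instead of one per INSERT
    For i As Integer = 1 To count
      db.ExecuteSQL("INSERT INTO bill_item (bill_id, position) VALUES (?, ?)", billID, i)
    Next
    db.CommitTransaction
  Catch e As DatabaseException
    db.RollbackTransaction
    Raise e
  End Try
End Sub
```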

The question of “best practice” is, of course, philosophical. In my opinion, it is best to follow your actual business process. This involves defining your logical functional blocks and verifying them against your existing database structure, which should be done concurrently. Ideally, performance aspects should also be taken into account.

1 Like

I’d use them for any logically related data - invoice header + detail lines

It’s just a good habit to keep data consistent & reliable

Even in high performance situations I’ve used them to insert data (millions of rows a minute)
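The header-plus-lines pattern might look like this (a sketch for SQLite; the last_insert_rowid() call is SQLite-specific, and the [bill]/[bill_item] columns are illustrative):

```xojo
Sub SaveBill(db As Database, customerID As Integer, items() As String)
  Try
    db.BeginTransaction
    db.ExecuteSQL("INSERT INTO bill (customer_id) VALUES (?)", customerID)
    ' fetch the generated header key so the lines can reference it
    Var billID As Integer = db.SelectSQL("SELECT last_insert_rowid()").ColumnAt(0).IntegerValue
    For Each item As String In items
      db.ExecuteSQL("INSERT INTO bill_item (bill_id, description) VALUES (?, ?)", billID, item)
    Next
    db.CommitTransaction ' header and all lines land together
  Catch e As DatabaseException
    db.RollbackTransaction ' no orphaned header, no partial lines
    Raise e
  End Try
End Sub
```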

3 Likes

Same here - you usually get more performance issues (talking about your own performance as a developer) if NOT using them :wink:

1 Like

I use them all the time

IF you have performance issues, it’s often NOT the transactions that are the problem.
Bad indexes, too many indexes, etc. can be far more painful than a transaction.

3 Likes

Exactly, especially when you are not screening your queries and what the database engine/optimizer is doing with them, for instance deciding that a full table scan is most likely the best approach for you when it actually isn’t. That happens often when people copy and paste from the internet, and AI will probably help us all keep earning money for the foreseeable future :slight_smile:
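With SQLite, for instance, you can screen a query right from Xojo (a sketch; `db` is a connected SQLiteDatabase, and the table/index names are made up) and add an index when the plan shows a full scan:

```xojo
Var custID As Integer = 42 ' example value

' Ask SQLite how it intends to execute the query
Var rs As RowSet = db.SelectSQL("EXPLAIN QUERY PLAN SELECT * FROM bill WHERE customer_id = ?", custID)
While Not rs.AfterLastRow
  System.DebugLog(rs.Column("detail").StringValue) ' e.g. "SCAN bill" vs "SEARCH bill USING INDEX ..."
  rs.MoveToNextRow
Wend

' If the plan says "SCAN bill", an index on the filter column may help:
db.ExecuteSQL("CREATE INDEX IF NOT EXISTS idx_bill_customer ON bill (customer_id)")
```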

1 Like

Indeed

Or what worked fine with 100 rows, and maybe a few thousand, is no longer great with millions of rows.

Software, including DBs, is a living thing and needs care, feeding, monitoring and updating as conditions change.
Once you think “it’s done” and stop looking at it, you fall behind VERY quickly and then have piles of technical debt to fix eventually.

1 Like

I use transactions for every write. In fact, I have my classes set up to not allow transactions on the main thread, and each write/transaction thread gets a new connection to the database. This means the main thread is read-only and never potentially blocked by write operations.
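Roughly this pattern, sketched in Xojo (the class name, properties and SQL are made up; the real setup is presumably more involved):

```xojo
' Run event of a Thread subclass, e.g. "WriteWorker".
' mPersonID / mNewName are instance properties set before Start.
Sub Run()
  Var db As New SQLiteDatabase
  db.DatabaseFile = SpecialFolder.ApplicationData.Child("app.sqlite")
  db.Connect ' this thread gets its own connection

  Try
    db.BeginTransaction
    db.ExecuteSQL("UPDATE person SET name = ? WHERE id = ?", mNewName, mPersonID)
    db.CommitTransaction
  Catch e As DatabaseException
    db.RollbackTransaction
  End Try
End Sub
```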

2 Likes

Yep, the only way to ensure that you always use transactions when needed is to always use them. Otherwise the risk that you get sloppy is just too high.

1 Like

Since Xojo is cooperatively threaded, this might not matter that much across threads.
It’s still a good practice.

In something like Java, C#, Go, etc., where true preemptive multithreading happens, it’s much more important.

Xojo Workers, if I used them, should use transactions for all writes as well
Console helper apps should as well

ANY write should

No, definitely not strictly necessary for Xojo. But I’ve definitely had much more reliable experiences keeping things segregated.

My current project is not transactional; it is essentially a data warehouse from which reporting happens. I always create the various entities in a specific sequence, and all the involved stored procs and the controlling code are idempotent; if there is a problem partway through, I can restart and everything works fine. So for speed, I mostly skip the overhead of transactions. An exception that comes to mind is swapping the primary display name of a business with one of its DBAs (“doing business as” names), which involves exchanging a type code on 2 name records. Since there can only be one primary name, I wrap those two updates in a transaction to make sure that they both happen or fail together.
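Sketched with a hypothetical business_name table (id, name_type), since the actual schema and stored procs aren’t shown here, that swap is just:

```xojo
Sub SwapPrimaryName(db As Database, primaryNameID As Integer, dbaNameID As Integer)
  Try
    db.BeginTransaction
    db.ExecuteSQL("UPDATE business_name SET name_type = 'DBA' WHERE id = ?", primaryNameID)
    db.ExecuteSQL("UPDATE business_name SET name_type = 'PRIMARY' WHERE id = ?", dbaNameID)
    db.CommitTransaction ' both flip, or neither: never zero or two primary names
  Catch e As DatabaseException
    db.RollbackTransaction
    Raise e
  End Try
End Sub
```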

Another performance hack is that 100% of my reads (queries, reports) run at an isolation level that allows “dirty reads”, as these are acceptable in our application. If a report is missing some data that is still in the process of materializing, it won’t look inconsistent, and whatever does make it onto the report is accurate. Effectively we have one process pumping records into the system and a bazillion users pulling reports out, so minimizing locks is a big help in keeping the system capacity up.
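On SQL Server that looks something like this (a sketch; other engines spell the isolation level differently, and the table name is hypothetical):

```xojo
' Per connection: allow dirty reads for everything that follows
db.ExecuteSQL("SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED")

' Or per query, via a T-SQL table hint:
Var rs As RowSet = db.SelectSQL("SELECT business_id, name FROM business_name WITH (NOLOCK)")
```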

In normal CRUD apps and accounting software and things like that, I agree with Norman, it’s a good idea to be in the habit of using transactions and enforcing consistency. But like every rule, it’s made to be selectively and intelligently broken.