Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hope to support oceanbase databases #563

Closed
deskau opened this issue Jun 19, 2023 · 24 comments · Fixed by #670
Closed

Hope to support oceanbase databases #563

deskau opened this issue Jun 19, 2023 · 24 comments · Fixed by #670
Assignees
Labels
enhancement New feature or request

Comments

@deskau
Copy link

deskau commented Jun 19, 2023

The oceanbase database has two modes: MySQL and Oracle, which are mostly syntax compatible with both MySQL and Oracle. Currently, when testing the oceanbase MySQL mode, there are some issues that we hope to support.

  1. For example, in src/mysql/mysqloltp.tcl, the name is timestamp. In the stored procedure required to create a test, there is an input parameter named timestamp. However, in oceanbase, timestamp is a reserved keyword, so creating this stored procedure will result in an error. The current solution is to manually add 'timestamp' for each translation。
  2. Because oceanbase does not support the MySQL statistical information collection “analyze table” command, the syntax for reporting errors in buildschema is not supported: it is also necessary to annotate the three lines 488,489,490 in the src/mysql/mysqloltp.tcl file to avoid errors。
@sm-shaw
Copy link
Contributor

sm-shaw commented Jun 19, 2023

Hi @deskau - yes this should definitely be possible and something done for other databases by adding a compatible parameter.
So for example in PostgreSQL we have an Oracle compatible (For EnterpriseDB Advanced Server) and Citus https://github.com/citusdata/citus compatible options that were submitted.
In the dict you see the following:

hammerdb>print dict
Dictionary Settings for PostgreSQL
..
tpcc       {
...
 pg_oracompat        = false
 pg_cituscompat      = false
...
}

and this is also shown in the GUI which also sets the correct options such as the use of stored procedures
image

Then where needed, so in the build, driver (when modifying the driver changes need to go in multiple sections for the different options such as asynchronous) or both you pass the compatible variable and use this in the script to change the behaviour as required, for example:

   if { $citus_compatible eq "true" } {
            set sql(7) { SELECT create_distributed_function('dbms_random(int,int)') }
            set sql(8) { SELECT create_distributed_function(oid, '$1', colocate_with:='warehouse') FROM pg_catalog.pg_proc WHERE proname IN ('neword', 'delivery', 'payment', 'ostat', 'slev') }
        }

so if this is of interest then looking at the examples above should be able to help you find out what to do.

Then you submit a pull request with your changes, for example this was submitted for the Citus Changes #302 (you wouldn't need the additional formatting that was done here) that works for you.

We can test to make sure everything is still OK with the default MySQL build and test.

Then this will add the Oceanbase compatible option you want which the user can enable by selecting a checkbox.

Feel free to ask for further help or guidance (for example on updating the GUI options).

@deskau
Copy link
Author

deskau commented Jun 21, 2023

image
Thank you, there's another question. Show global status where Variable_ Name='Com_ Commit 'or Variable_ Name='Com_ Roll back '. Is this SQL statement necessary? Because this sentence cannot be queried in the MySQL mode of the oceanbase database, there is no variable $com during the test run process_ Comm and $com_ The value of roll

@sm-shaw
Copy link
Contributor

sm-shaw commented Jun 21, 2023

Yes, this SQL statement is necessary, this gives you the TPM value - so your transactions are the user commits + user rollbacks, taken over the duration of the test this is then used to calculate transactions per minute.

Note that the transaction counter (in both GUI and CLI) also uses the same statement to get the transaction data for MySQL/MariaDB. Therefore, if this statement doesn't work in the database you want to make compatible, then you need to find a replacement.

It is also important to note that we query TPM for transaction counter as this is typically an in-memory table so sampling it will not impact the test performance, we never query NOPM during the test for this reason and only capture this at the start and end. You should have an equivalent you can use with a modified SQL statement to get the commits + rollbacks.

Also, if it is necessary to change the driver script, there are multiple sections that need modifying for the different options the user has chosen, so you need to find this statement in all sections of the file mysqloltp.tcl file you are editing, not just the first one.

@deskau
Copy link
Author

deskau commented Jul 19, 2023

RUN HAMMERDB TEST;
When running the test, there was an error. Which code here cannot adapt. I hope you can help me, thank you

image

Error in Virtual User 2: mysqlexec/db server: Commands out of sync; you can't run this command now
Vuser 2:FINISHED FAILED
Error in Virtual User 6: mysqlexec/db server: Commands out of sync; you can't run this command now
Vuser 6:FINISHED FAILED

@sm-shaw
Copy link
Contributor

sm-shaw commented Jul 19, 2023

It could be for a number of reasons. Most likely is a syntax error such as having a semicolon terminating the SQL you are running from HammerDB"

@deskau
Copy link
Author

deskau commented Jul 21, 2023

Which file in src/mysql is being run in this step,Performed tpcc testing
image

@sm-shaw
Copy link
Contributor

sm-shaw commented Jul 21, 2023

mysqloltp.tcl contains the code that is run for the OLTP / tpcc derived testing.

As an example, this would give the error you see (with semi-colon)

if {[catch {set start_nopm [ list [ mysql::sel $mysql_handler "select sum(d_next_o_id) from district;" -list ] ]}]} {

and this would not

if {[catch {set start_nopm [ list [ mysql::sel $mysql_handler "select sum(d_next_o_id) from district" -list ] ]}]} {

@deskau
Copy link
Author

deskau commented Jul 21, 2023

No already; End SQL. Need to find out which SQL is incompatible

@sm-shaw
Copy link
Contributor

sm-shaw commented Jul 21, 2023

It is difficult to provide guidance here without more information on the problems you are encountering.

@sm-shaw
Copy link
Contributor

sm-shaw commented Aug 2, 2023

@deskau allow me to introduce @Jiang-Hua - @Jiang-Hua is the leading HammerDB developer in your region and has added a number of features for open source databases such as the PostgreSQL metrics cbf83e3
If you can connect locally then @Jiang-Hua will be able to help you.

@sm-shaw sm-shaw added the enhancement New feature or request label Aug 2, 2023
@Jiang-Hua
Copy link
Contributor

Hello @deskau, nice to see your attempt to support OceanBase on HammerDB. I am in Beijing, and you can contact me through github, email (hua.h.jiang@intel.com), or phone (18612278445) to let me know your progress. We can work together to enable HammerDB to support OceanBase.

@sm-shaw sm-shaw assigned sm-shaw, Jiang-Hua and deskau and unassigned sm-shaw Aug 3, 2023
@deskau
Copy link
Author

deskau commented Aug 4, 2023

Thank you for your support. Due to unsuccessful adaptation. I have temporarily given up. I will continue to study when I have free time, and please guide me then.

@sm-shaw
Copy link
Contributor

sm-shaw commented Aug 4, 2023

If you can provide details on the unsuccessful adaptation either here or directly to @Jiang-Hua then we may be able to resolve the issue before you return to it.

@Jiang-Hua
Copy link
Contributor

I will do some evaluation in my OceanBase environment, and any progress will be updated here.

@deskau
Copy link
Author

deskau commented Aug 9, 2023

This is a good thing. If you encounter any problems, I hope you can share and discuss them together. I have pointed out the problem I encountered above, as I previously used the Docker environment. The command used is/Scripts/Python/mysql/tprocc/mysql_ tprocc_ py.sh

@xtangxtang
Copy link
Contributor

Hello @deskau @sm-shaw I will continue the OceanBase integration work. You can contact me through my email (xi.tang@intel.com). I am also in Beijing and submitted two PRs in OceanBase. If any updates, I will post here.

@xtangxtang
Copy link
Contributor

Hi @deskau, When I integrate the Oceanbase into HammderDB and run build schema, there is a "Cannot add foreign key constraint" error if run "CREATE TABLE LINEITEM (
L_SHIPDATE DATE NULL,
L_ORDERKEY BIGINT NOT NULL,
L_DISCOUNT DECIMAL(10,2) NOT NULL,
L_EXTENDEDPRICE DECIMAL(10,2) NOT NULL,
L_SUPPKEY INT NOT NULL,
L_QUANTITY INT NOT NULL,
L_RETURNFLAG CHAR(1) BINARY NULL,
L_PARTKEY INT NOT NULL,
L_LINESTATUS CHAR(1) BINARY NULL,
L_TAX DECIMAL(10,2) NOT NULL,
L_COMMITDATE DATE NULL,
L_RECEIPTDATE DATE NULL,
L_SHIPMODE CHAR(10) BINARY NULL,
L_LINENUMBER INT NOT NULL,
L_SHIPINSTRUCT CHAR(25) BINARY NULL,
L_COMMENT VARCHAR(44) BINARY NULL,
PRIMARY KEY (L_ORDERKEY, L_LINENUMBER),
INDEX LINEITEM_PART_SUPP_FKIDX (L_PARTKEY,L_SUPPKEY),
INDEX IDX_LINEITEM_ORDERKEY_FKIDX (L_ORDERKEY),
FOREIGN KEY LINEITEM_FK1(L_ORDERKEY) REFERENCES ORDERS(O_ORDERKEY),
FOREIGN KEY LINEITEM_FK2(L_PARTKEY, L_SUPPKEY) REFERENCES PARTSUPP(PS_PARTKEY, PS_SUPPKEY)"

@deskau is there any contrain for Oceanbase to add Foreign Key? Other tables can be created, only lineitem report error on Onceanbase.

@deskau
Copy link
Author

deskau commented Aug 30, 2023

Can you provide 2 dependent external tables?I'll test it locally

@sm-shaw
Copy link
Contributor

sm-shaw commented Aug 30, 2023

The LINEITEM table is in the TPROC-H schema i.e. the analytic data warehouse workload. There are no Foreign keys defined for the TPROC-C schema i.e. the OLTP schema.

For TPROC-H this is derived from the TPC-H specification that says: "The use of constraints is optional and limited to primary key, foreign key, check, and not null constraints." so you do not have to define foreign keys if you do not want to or do not support them.

However, and @Jiang-Hua may be able to provide more information here, using PostgreSQL just as an example (and I appreciate that we are looking at a MySQL derived schema here) we looked into implementing partitioning to improve TPROC-H performance. Doing this it was found (for PostgreSQL) the partition key had to be added to the primary key, but this then meant it was not possible to create the foreign key. Testing without creating the foreign key meant performance was lower (so partitioning has not been implemented for PostgreSQL).

So, there is scope for testing here to decide what you want to do. You do not have to create the foreign keys for TPROC-H, but this may then impact performance. TPROC-C as above does not have foreign keys defined.

@xtangxtang
Copy link
Contributor

xtangxtang commented Aug 31, 2023

@deskau I have checked, it seems that it is the line "FOREIGN KEY LINEITEM_FK1(L_ORDERKEY) REFERENCES ORDERS(O_ORDERKEY)" to report the error. For all the creating table SQL you can check mysqlolap.tcl file (started from line 103).

@sm-shaw Thanks for your clarification. So if Oceanbase has any constraints for the foreign key, I will write an Oceanbase version TPROC-H creating table SQL which do not create the foreign keys.

@sm-shaw
Copy link
Contributor

sm-shaw commented Aug 31, 2023

There are a couple of things to note before you do this.

Firstly, the TPROC-H schema is focused on testing analytic capabilities with ad-hoc complex queries. There are 2 typical approaches to high performance queries, parallel query and columnstores. The databases that HammerDB supports have either/or both. MySQL does not support analytics as it does not have parallel query or a columnstore. MariaDB does have an open source columnstore. So you need to ensure that Oceanbase has the capabilities for this type of workload before enabling it, otherwise single-threaded queries will be long-running. Also note that nearly all columnstores have much higher load times than row-oriented databases, so usually you need to bulk load them manually.

Secondly, just removing the foreign keys because they are not supported may not be a good idea. The foreign keys can significantly benefit the performance of joins between the parent and child tables of which there are many in the TPROC-H workload. (@Jiang-Hua confirmed this with PostgreSQL when implementing partitioning). So this is something that needs to be tested just before removing them.

Depending on the capabilities of the storage engine you are using with Oceanbase it could be better starting with the TPROC-C/OLTP workload.

Also, you should consider how different Oceanbase is from MySQL as there are 2 options here as well. For some changes as above if it works with the regular MySQL schema and minor changes then adding an "Oceanbase" compatible checkbox is the way to go and the default choice.

As an alternative, MariaDB for example was considered different enough from MySQL to be added as a separate database to HammerDB with its own interface etc. The main consideration here is ongoing support and the ability to test new changes and features. Therefore, if this is preferred, it may be another direction to take aside from the default if the goal is more flexibility to make changes.

@xtangxtang
Copy link
Contributor

There are a couple of things to note before you do this.

Firstly, the TPROC-H schema is focused on testing analytic capabilities with ad-hoc complex queries. There are 2 typical approaches to high performance queries, parallel query and columnstores. The databases that HammerDB supports have either/or both. MySQL does not support analytics as it does not have parallel query or a columnstore. MariaDB does have an open source columnstore. So you need to ensure that Oceanbase has the capabilities for this type of workload before enabling it, otherwise single-threaded queries will be long-running. Also note that nearly all columnstores have much higher load times than row-oriented databases, so usually you need to bulk load them manually.

Secondly, just removing the foreign keys because they are not supported may not be a good idea. The foreign keys can significantly benefit the performance of joins between the parent and child tables of which there are many in the TPROC-H workload. (@Jiang-Hua confirmed this with PostgreSQL when implementing partitioning). So this is something that needs to be tested just before removing them.

Depending on the capabilities of the storage engine you are using with Oceanbase it could be better starting with the TPROC-C/OLTP workload.

Also, you should consider how different Oceanbase is from MySQL as there are 2 options here as well. For some changes as above if it works with the regular MySQL schema and minor changes then adding an "Oceanbase" compatible checkbox is the way to go and the default choice.

As an alternative, MariaDB for example was considered different enough from MySQL to be added as a separate database to HammerDB with its own interface etc. The main consideration here is ongoing support and the ability to test new changes and features. Therefore, if this is preferred, it may be another direction to take aside from the default if the goal is more flexibility to make changes.

@sm-shaw Thanks for your suggestion. This could be a good start for me to integrate Oceanbase into HammerDB

@sm-shaw
Copy link
Contributor

sm-shaw commented Sep 1, 2023

Looking a little further Oceanbase does support OLTP (TPROC-C) and OLAP (TPROC-H/analytics) etc with the OLAP approach being columnstore based. Many columnstores do not support foreign keys so that is almost certainly why they are also not supported here.
If you look at MariaDB there is a MySQL compatible columnstore based build already defined, so you could probably copy this.

@sm-shaw
Copy link
Contributor

sm-shaw commented Sep 1, 2023

Looking at MariaDB means the HammerDB TPROC-H schema build for MariaDB when the storage engine entered is "Columnstore". (Also, some of the queries have modifications for compatibility).

@sm-shaw sm-shaw linked a pull request Feb 19, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants