2043 split gbr table #5863
Update the fork to latest develop branch
The first step in implementing Guestbook-at-Download functionality is to split the GuestbookResponse table into GuestbookResponse and FileDownload, moving a few columns from the former into the latter.
…nload methods for functionality moved to the latter.
Merge upstream changes to 2043-split-gb-table branch
@mdmADA hi! @scolapasta asked me to try to explain that we've recently switched to Flyway for database updates, so your SQL script should now go in [...]. I'll at least put a screenshot of the intro and contents, but please read through that page and let us know if anything is confusing. We're still getting used to Flyway to be honest (please see #5862 for example), so please do not hesitate to reach out if you have any questions! The goal with Flyway is to automate the execution of database migration scripts. But we still have to write them by hand. 😄 We attempt to explain what to expect to users (sysadmins upgrading Dataverse) in the Installation Guide. Please see 3b07490 or the screenshot below for the old pre-Flyway explanation compared to the post-Flyway explanation. Ok, one more screenshot from http://phoenix.dataverse.org/schemaspy/latest/tables/flyway_schema_history.html to show you how Flyway tracks whether or not an upgrade has been run already. Again, please ask if you get stuck! 😄
Thanks, Phil, for the info on Flyway. It seems pretty straightforward. Since my code is adding the new filedownload table, a create SQL script isn't required. I am wondering how Glassfish will 'know' to set the primary key of the filedownload table to be the guestbookresponse_id column (also a foreign key to guestbookresponse.id). Since my code alters the guestbookresponse table, an update SQL file is required. I will add the file to the src/main/resources/db/migration/ directory. I am planning on naming it V4.14.0.1__2043-split-gbr-table.sql. Thanks!
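The proposed file name follows Flyway's versioned-migration convention: a `V` prefix, a version, a double-underscore separator, a description, and a `.sql` suffix. The sketch below checks a name against that shape and compares versions numerically. The class `MigrationName` and its methods are hypothetical helpers written for illustration, not part of Flyway or Dataverse, and the comparison (missing parts count as 0) is an assumption rather than Flyway's own comparator.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Flyway or Dataverse. */
final class MigrationName {
    // V<version>__<description>.sql, with version parts separated by dots or underscores.
    private static final Pattern VERSIONED =
            Pattern.compile("V(\\d+(?:[._]\\d+)*)__(.+)\\.sql");

    final int[] version;
    final String description;

    private MigrationName(int[] version, String description) {
        this.version = version;
        this.description = description;
    }

    /** Parses a file name, rejecting anything that does not match the versioned shape. */
    static MigrationName parse(String fileName) {
        Matcher m = VERSIONED.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a versioned migration name: " + fileName);
        }
        int[] parts = Arrays.stream(m.group(1).split("[._]"))
                .mapToInt(Integer::parseInt)
                .toArray();
        return new MigrationName(parts, m.group(2));
    }

    /** Compares versions part by part; missing parts count as 0 (an assumption). */
    static int compareVersions(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        MigrationName m = parse("V4.14.0.1__2043-split-gbr-table.sql");
        System.out.println(Arrays.toString(m.version) + " " + m.description);
    }
}
```

A name with only a single underscore after the version would be rejected by `parse`, which is the kind of mistake worth catching before a script is committed.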
… into filedownload and guestbookresponse. Create table not required for new filedownload table. Insert and alter commands required for inserting data from guestbookresponse into filedownload and for modifying guestbookresponse table.
Merge upstream to 2043-split-gb-table
@mdmADA are you talking about the CONSTRAINT above? I believe you'll need to add all the intelligence in the SQL script. From what I understand, Flyway also supports writing migrations in Java, if that helps, but we've never tried.
@pdurbin - I understood that since my code creates a new table, filedownload, I don't need to provide an SQL script that creates that table, and that Glassfish will do that during the deploy process. If that understanding is correct and Glassfish creates the table for 'me', then I was wondering how to add the primary and foreign key constraints for the new filedownload table, or whether those would be created by Glassfish as well. If Glassfish doesn't take care of this, I can add those constraints to the update SQL script in which the fields from guestbookresponse are written to the filedownload table and then dropped from guestbookresponse. Thanks!
@mdmADA the way to test this would be to simulate what will happen to installations as they upgrade, I would think. Something like this:
Does that make sense?
Moving to QA. As this PR is backend changes, testing should focus on regression. We should also test how long it will take to do an upgrade on a production-like db using a Flyway script.
@mdmADA We've found an issue with the flyway script: it fails on the drop column line due to a syntax error.
but should be:
Also, please refresh the branch from /develop as we've updated the version numbers. Thanks!
merge upstream develop into branch pull request
Updated!
@mdmADA Local Exception Stack:
Right. I was expecting a few issues and will look into these ASAP.
…G_FOR_PAGE_DISPLAY sql strings to include the join between guestbookresponse and new filedownload table.
Ok - I updated the hard-coded SQL queries in GuestbookResponseServiceBean.java to include the join between the new filedownload table and the modified guestbookresponse table, so that downloadtype is queried from the former instead of the latter. I tested the following scenarios with a Dataverse yyyyy that has 2 datasets: 1 with a guestbook and 1 without:
Comments/Questions:
I never did respond to these points from @scolapasta: "primary and foreign keys on the new table should automatically be created. Good question about foreign keys to the new table." - mdmADA: The keys were taken care of, and correctly, by Flyway. "3. A guestbook will eventually have to be associated with a request, but that request should have the link to the file, so my initial thought is it's safe to move these connections to download table."
Just to clarify, the guestbookresponse table would then have the columns: and the filedownload table would then have the columns: "4. I think a download cannot exist with a guestbook, but a guestbook will be able to live without a download." - mdmADA: do you mean: a download must have a guestbookresponse (even if default) but a guestbookresponse might not have a download => makes sense to me
There are also native queries on the guestbookresponse table in GuestbookServiceBean.java and MetricsServiceBean.java, but they don't appear to be using any fields that have been moved to the new table, so we should be ok. (But please double-check that, in case I'm missing something.)
@mdmADA On further testing, this seems to exist on the develop branch too. So, not your issue.
For the purposes of showing the responses in the UI, it looks like we are adding a limit to the query (5K by default; configurable in the settings table).
Related Issues
Submitting code to split the guestbookresponse table into 2 tables: guestbookresponse and filedownload
This is the first step in moving towards enabling the Guestbook at Request Access so that dataset owners can ask questions of users when they request access and use their answers as a basis to grant or reject access.
I created the FileDownload.java class and modified the GuestbookResponse.java class and created a one-to-one relationship between the two with the GuestbookResponse as the 'parent' and FileDownload as 'child'. guestbookresponse_id in filedownload is a foreign key to guestbookresponse.id and is also the primary key of filedownload.
Questions I have:
1. Not sure whether 'Guest' should be written to the guestbookresponse.name field when a file is not restricted and has no guestbook. It was (verified) in previous versions (e.g. 4.6.1) but isn't now.
I know 'GUEST' has been removed from prefilling the Guestbook pop-up field, but the scenario I am asking about is when a datafile is open and the dataset has no guestbook, so there is no pre-filled 'Guest' entry at all.
I think writing 'Guest' (or the language-specific equivalent) into the name when there is no guestbook is a good idea. Otherwise, when I look at the table and nothing is written to any of the user-specific fields, I wonder whether an issue/bug prevented the details from being written. Having 'Guest' indicates to me as an admin user that the download process worked properly.
2. Not sure if the timestamp fields are required in BOTH guestbookresponse AND in filedownload or just one of them.
3. Not sure if the DATAFILE_ID, DATASET_ID, DATASETVERSION_ID etc. should be moved to filedownload since these fields ARE related to the filedownload itself.
4. In my meeting with @scolapasta so long ago now, we said the assumption is that if there is a download there will be a guestbookresponse
but Q: Is there a case where there will be a download without a guestbook response?
5. I am not sure how to include the database scripts related to the filedownload creation and guestbookresponse alteration (and inserting any guestbookresponse fields into the new filedownload fields) in the pull request:
-- upgrading to version xxx
CREATE TABLE FILEDOWNLOAD(
DOWNLOADTYPE VARCHAR(255),
DOWNLOADTIMESTAMP TIMESTAMP,
SESSIONID VARCHAR(255),
GUESTBOOKRESPONSE_ID BIGINT NOT NULL,
PRIMARY KEY (GUESTBOOKRESPONSE_ID)
);
ALTER TABLE FILEDOWNLOAD ADD CONSTRAINT FK_DOWNLOADS_GUESTBOOKRESPONSE_ID FOREIGN KEY (GUESTBOOKRESPONSE_ID) REFERENCES GUESTBOOKRESPONSE (ID);
-- if upgrading to version xxx:
-- copy the download-related columns of every existing response into filedownload
insert into filedownload(GUESTBOOKRESPONSE_ID,DOWNLOADTYPE,DOWNLOADTIMESTAMP,SESSIONID) select ID, DOWNLOADTYPE,RESPONSETIME,SESSIONID from guestbookresponse;
-- not sure if this is wanted right away:
-- PostgreSQL requires one DROP COLUMN clause per column
alter table GUESTBOOKRESPONSE drop column DOWNLOADTYPE, drop column SESSIONID;
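
To make the intended data movement easier to reason about, here is a minimal in-memory sketch of the two steps in the script above: copying the download columns into filedownload rows keyed by the response id, then dropping those columns from guestbookresponse. All class and method names are hypothetical and exist only for this illustration, the sample values are made up, and only the columns named in the script are modeled; this is not the Dataverse entity code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of the table split; not Dataverse code. */
final class GuestbookSplit {

    /** A guestbookresponse row before the split (only the columns discussed here). */
    static final class LegacyRow {
        final long id;
        final String downloadType;
        final String responseTime;
        final String sessionId;

        LegacyRow(long id, String downloadType, String responseTime, String sessionId) {
            this.id = id;
            this.downloadType = downloadType;
            this.responseTime = responseTime;
            this.sessionId = sessionId;
        }
    }

    /** A filedownload row; guestbookResponseId acts as both primary and foreign key. */
    static final class FileDownloadRow {
        final long guestbookResponseId;
        final String downloadType;
        final String downloadTimestamp;
        final String sessionId;

        FileDownloadRow(long guestbookResponseId, String downloadType,
                        String downloadTimestamp, String sessionId) {
            this.guestbookResponseId = guestbookResponseId;
            this.downloadType = downloadType;
            this.downloadTimestamp = downloadTimestamp;
            this.sessionId = sessionId;
        }
    }

    /** A guestbookresponse row after the moved columns are dropped. */
    static final class GuestbookResponseRow {
        final long id;
        final String responseTime;

        GuestbookResponseRow(long id, String responseTime) {
            this.id = id;
            this.responseTime = responseTime;
        }
    }

    /** Mirrors the INSERT ... SELECT: one filedownload row per existing response. */
    static List<FileDownloadRow> copyToFileDownload(List<LegacyRow> legacy) {
        List<FileDownloadRow> out = new ArrayList<>();
        for (LegacyRow r : legacy) {
            out.add(new FileDownloadRow(r.id, r.downloadType, r.responseTime, r.sessionId));
        }
        return out;
    }

    /** Mirrors the DROP COLUMN step: downloadType and sessionId no longer live here. */
    static List<GuestbookResponseRow> dropMovedColumns(List<LegacyRow> legacy) {
        List<GuestbookResponseRow> out = new ArrayList<>();
        for (LegacyRow r : legacy) {
            out.add(new GuestbookResponseRow(r.id, r.responseTime));
        }
        return out;
    }

    public static void main(String[] args) {
        List<LegacyRow> legacy = new ArrayList<>();
        legacy.add(new LegacyRow(1L, "Download", "2019-05-20 10:00:00", "session-a"));
        List<FileDownloadRow> downloads = copyToFileDownload(legacy);
        List<GuestbookResponseRow> responses = dropMovedColumns(legacy);
        System.out.println(downloads.size() + " download row(s), "
                + responses.size() + " response row(s)");
    }
}
```

In this model the response keeps its own timestamp while the download receives a copy of it, which matches the script as written and leaves question 2 (whether both tables need a timestamp) open.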