
2043 split gbr table #5863

Merged
merged 24 commits into from
Jun 20, 2019

Conversation

@mdmADA (Contributor) commented May 21, 2019

Related Issues

Submitting code to split the guestbookresponse table into 2 tables: guestbookresponse and filedownload

This is the first step in moving towards enabling the Guestbook at Request Access so that dataset owners can ask questions of users when they request access and use their answers as a basis to grant or reject access.

I created the FileDownload.java class, modified the GuestbookResponse.java class, and created a one-to-one relationship between the two, with GuestbookResponse as the 'parent' and FileDownload as the 'child'. guestbookresponse_id in filedownload is a foreign key to guestbookresponse.id and also serves as the primary key of filedownload.

Questions I have:

1. Not sure whether, when a file is not restricted and has no guestbook, 'Guest' should be written to the guestbookresponse.name field. It was (verified) in previous versions (e.g. 4.6.1) but isn't now.

I know 'GUEST' has been removed from pre-filling the Guestbook pop-up field, but the scenario I am asking about is when a datafile is open and the dataset has no guestbook, so there is no pre-filled 'Guest' entry at all.

I think writing 'Guest' (or a language-specific equivalent) into the name field when there is no guestbook is a good idea. Otherwise, when I look at the table and nothing is written to any of the user-specific fields, I wonder whether an issue/bug prevented the details from being written. Having 'Guest' there indicates to me, as an admin user, that the download process worked properly.

2. Not sure whether the timestamp fields are required in BOTH guestbookresponse AND filedownload, or just one of them.

3. Not sure whether DATAFILE_ID, DATASET_ID, DATASETVERSION_ID, etc. should be moved to filedownload, since these fields ARE related to the file download itself.

4. In my meeting with @scolapasta, long ago now, we said the assumption is that if there is a download there will be a guestbookresponse.
Q: Is there a case where there will be a download without a guestbook response?

5. I am not sure how to include in the pull request the database scripts related to creating filedownload, altering guestbookresponse, and inserting the relevant guestbookresponse fields into the new filedownload fields:

-- upgrading to version xxx
CREATE TABLE FILEDOWNLOAD(
DOWNLOADTYPE VARCHAR(255),
DOWNLOADTIMESTAMP TIMESTAMP,
SESSIONID VARCHAR(255),
GUESTBOOKRESPONSE_ID BIGINT NOT NULL,
PRIMARY KEY (GUESTBOOKRESPONSE_ID)
);

ALTER TABLE FILEDOWNLOAD ADD CONSTRAINT FK_DOWNLOADS_GUESTBOOKRESPONSE_ID FOREIGN KEY (GUESTBOOKRESPONSE_ID) REFERENCES GUESTBOOKRESPONSE (ID);

-- if upgrading to version xxx:
insert into filedownload(GUESTBOOKRESPONSE_ID,DOWNLOADTYPE,DOWNLOADTIMESTAMP,SESSIONID) select ID, DOWNLOADTYPE,RESPONSETIME,SESSIONID from guestbookresponse;

-- not sure if this is wanted right away:
alter table GUESTBOOKRESPONSE drop column DOWNLOADTYPE, SESSIONID;
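As a sanity check, the data motion the script above describes (copy the download-related columns into a new child table keyed on the parent id) can be sketched in-memory. This is an illustrative sketch only: SQLite stands in for PostgreSQL, and the sample rows are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Minimal stand-in for the existing parent table (illustrative columns only)
cur.execute("""CREATE TABLE guestbookresponse (
    id INTEGER PRIMARY KEY,
    name TEXT,
    downloadtype TEXT,
    sessionid TEXT,
    responsetime TEXT)""")
cur.executemany("INSERT INTO guestbookresponse VALUES (?, ?, ?, ?, ?)",
                [(1, "Guest", "Download", "s1", "2019-05-01"),
                 (2, "Jane",  "Subset",   "s2", "2019-05-02")])

# New child table: guestbookresponse_id is both the primary key and a
# foreign key to guestbookresponse.id, mirroring the proposed schema
cur.execute("""CREATE TABLE filedownload (
    downloadtype TEXT,
    downloadtimestamp TEXT,
    sessionid TEXT,
    guestbookresponse_id INTEGER PRIMARY KEY
        REFERENCES guestbookresponse (id))""")

# Same shape as the proposed insert-select migration above
cur.execute("""INSERT INTO filedownload
    (guestbookresponse_id, downloadtype, downloadtimestamp, sessionid)
    SELECT id, downloadtype, responsetime, sessionid FROM guestbookresponse""")

rows = cur.execute("""SELECT guestbookresponse_id, downloadtype, downloadtimestamp
                      FROM filedownload ORDER BY guestbookresponse_id""").fetchall()
print(rows)  # [(1, 'Download', '2019-05-01'), (2, 'Subset', '2019-05-02')]
```

The key point the sketch captures is that every filedownload row inherits its identity from its guestbookresponse row, so the insert-select cannot create orphans.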

mdmADA added 16 commits May 14, 2019 12:39
Update the fork to latest develop branch
First step in implementing Guestbook-at-Download functionality is to split the GuestbookResponse table
into GuestbookResponse and FileDownload and moving a few columns from former into latter.
…nload methods for functionality moved to the latter.
Merge upstream changes to 2043-split-gb-table branch
@mdmADA mdmADA changed the title 2043 split guestbookresponse table into guestbookresponse and filedownload 2043 split gbr table May 21, 2019
@coveralls commented May 21, 2019

Coverage Status

Coverage decreased (-0.004%) to 19.676% when pulling 4d82505 on mdmADA:2043-split-gb-table into fbdd587 on IQSS:develop.

@pdurbin (Member) commented May 22, 2019

5. I am not sure how to include the database scripts

@mdmADA hi! @scolapasta asked me to try to explain that we've recently switched to Flyway for database updates, so your SQL script should now go in src/main/resources/db/migration, as explained in http://guides.dataverse.org/en/4.14/developers/sql-upgrade-scripts.html

I'll at least put a screenshot of the intro and contents, but please read through that page and let us know if anything is confusing:

[Screenshot: intro and contents of the SQL upgrade scripts page in the Developer Guide]

We're still getting used to Flyway to be honest (please see #5862 for example) so please do not hesitate to reach out if you have any questions!

The goal with Flyway is to automate the execution of database migration scripts, but we still have to write them by hand. 😄 We attempt to explain what to expect to users (sysadmins upgrading Dataverse) in the Installation Guide. Please see 3b07490 or the screenshot below for the old pre-Flyway explanation compared to the post-Flyway explanation:

[Screenshot: Installation Guide upgrade instructions, pre-Flyway vs. post-Flyway]

Ok, one more screenshot, from http://phoenix.dataverse.org/schemaspy/latest/tables/flyway_schema_history.html, to show you how Flyway tracks whether or not an upgrade has already been run:

[Screenshot: the flyway_schema_history table as shown in SchemaSpy]

Again, please ask if you get stuck! 😄

@pdurbin pdurbin removed their assignment May 22, 2019
@mdmADA (Contributor, Author) commented May 23, 2019

Thanks, Phil, for the info on Flyway. It seems pretty straightforward. Since my code adds the new filedownload table, a CREATE SQL script isn't required. I am wondering how Glassfish will 'know' to set the primary key of the filedownload table to be the guestbookresponse_id column (which is also a foreign key to guestbookresponse.id).

Since my code alters the guestbookresponse table, an update SQL file is required. I will add the file to the src/main/resources/db/migration/ directory. I am planning to name it V4.14.0.1__2043-split-gbr-table.sql.

Thanks!

mdmADA added 2 commits May 23, 2019 15:44
… into filedownload and guestbookresponse.

Create table not required for new filedownload table.
Insert and alter commands required for inserting data from guestbookresponse into filedownload and for modifying guestbookresponse table.
Merge upstream to 2043-split-gb-table
@pdurbin (Member) commented May 23, 2019

I am wondering how Glassfish will 'know' to set the primary key of filedownload table to be the guestbookresponse_id column (also foreign key to guestbookresponse.id).

@mdmADA are you talking about the CONSTRAINT above? I believe you'll need to add all the intelligence in the SQL script. From what I understand, Flyway also supports writing migrations in Java, if that helps, but we've never tried it.

@mdmADA (Contributor, Author) commented May 24, 2019

@pdurbin - My understanding is that since my code creates a new table, filedownload, I don't need to provide a SQL script that creates that table; Glassfish will do that during the deploy process.

If that understanding is true, and Glassfish creates the table for 'me', then I was wondering how to add the primary and foreign key constraints for the new filedownload table, or whether those would be created by Glassfish as well.

If Glassfish doesn't take care of this, I can add those constraints to the update SQL script in which the fields from guestbookresponse are written to the filedownload table and then dropped from guestbookresponse.

Thanks!

@pdurbin (Member) commented May 24, 2019

@mdmADA the way to test this would be to simulate what will happen to installations as they upgrade, I would think. Something like this:

  • Install Dataverse from the "develop" branch.
  • Switch to the mdmADA 2043-split-gb-table branch.
  • Build and deploy the war file; Glassfish will create the new table and Flyway will run V4.14.0.1__2043-split-gbr-table.sql.
  • Check Postgres. Are the constraints you want in place? If so, you're done. If not, you'll have to add them to the Flyway script, right?

Does that make sense?

@djbrooke (Contributor) commented:

Moving to QA. As this PR contains backend changes, testing should focus on regression. We should also test how long an upgrade will take on a production-like db using a Flyway script.

@kcondon (Contributor) commented Jun 14, 2019

@mdmADA We've found an issue with the Flyway script: it fails on the drop column line due to a syntax error.
It appears as:

alter table guestbookresponse drop column DOWNLOADTYPE, SESSIONID;

but should be:

alter table guestbookresponse drop column DOWNLOADTYPE, drop column SESSIONID;

Also, please refresh branch from /develop as we've updated the version numbers. Thanks!

@mdmADA (Contributor, Author) commented Jun 18, 2019

Updated!
And branch refreshed from /develop.

@kcondon (Contributor) commented Jun 18, 2019

@mdmADA
Retested after the changes. Can now deploy, thanks!
The Flyway script successfully splits the table and preserves the records. However, post deployment/migration, the 'download responses', 'download all responses', and 'view responses' functions all throw a UI exception, and server.log complains about the missing download type:

Local Exception Stack:
Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.5.2.v20140319-9ad6abd): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: org.postgresql.util.PSQLException: ERROR: column r.downloadtype does not exist
Position: 40
Error Code: 0
Call: select r.id, v.value, r.responsetime, r.downloadtype, m.label, r.name from guestbookresponse r, datasetfieldvalue v, filemetadata m, dvobject o where v.datasetfield_id = (select id from datasetfield f where datasetfieldtype_id = 1 and datasetversion_id = (select max(id) from datasetversion where dataset_id = r.dataset_id)) and m.datasetversion_id = (select max(datasetversion_id) from filemetadata where datafile_id = r.datafile_id) and m.datafile_id = r.datafile_id and r.dataset_id = o.id and o.owner_id = 2 and r.guestbook_id = 2 order by r.id desc limit 5000;
Query: DataReadQuery(sql="select r.id, v.value, r.responsetime, r.downloadtype, m.label, r.name from guestbookresponse r, datasetfieldvalue v, filemetadata m, dvobject o where v.datasetfield_id = (select id from datasetfield f where datasetfieldtype_id = 1 and datasetversion_id = (select max(id) from datasetversion where dataset_id = r.dataset_id)) and m.datasetversion_id = (select max(datasetversion_id) from filemetadata where datafile_id = r.datafile_id) and m.datafile_id = r.datafile_id and r.dataset_id = o.id and o.owner_id = 2 and r.guestbook_id = 2 order by r.id desc limit 5000;")
    at org.eclipse.persistence.exceptions.DatabaseException.sqlException(DatabaseException.java:340)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.processExceptionForCommError(DatabaseAccessor.java:1611)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:674)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:558)
    at org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:2002)
    at org.eclipse.persistence.sessions.server.ClientSession.executeCall(ClientSession.java:298)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:242)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:228)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeSelectCall(DatasourceCallQueryMechanism.java:299)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeSelect(DatasourceCallQueryMechanism.java:281)
    at org.eclipse.persistence.queries.DataReadQuery.executeNonCursor(DataReadQuery.java:197)
    at org.eclipse.persistence.queries.DataReadQuery.executeDatabaseQuery(DataReadQuery.java:152)
    at org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:899)
    at org.eclipse.persistence.queries.DataReadQuery.execute(DataReadQuery.java:137)
    at org.eclipse.persistence.queries.DatabaseQuery.executeInUnitOfWork(DatabaseQuery.java:798)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2896)
    at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1804)
    at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1786)
    at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1751)
    at org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:258)
    at org.eclipse.persistence.internal.jpa.QueryImpl.getResultList(QueryImpl.java:469)
    at edu.harvard.iq.dataverse.GuestbookResponseServiceBean.findArrayByGuestbookIdAndDataverseId(GuestbookResponseServiceBean.java:253)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1081)
    at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1153)
    at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:4786)
    at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:656)
    at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
    at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
    at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:64)
    at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
    at sun.reflect.GeneratedMethodAccessor120.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
    at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
    at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
    at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
    at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.aroundInvoke(SystemInterceptorProxy.java:140)
    at sun.reflect.GeneratedMethodAccessor119.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
    at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
    at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:369)
    at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:4758)
    at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:4746)
    at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:212)
    at com.sun.ejb.containers.EJBLocalObjectInvocationHandlerDelegate.invoke(EJBLocalObjectInvocationHandlerDelegate.java:88)
    at com.sun.proxy.$Proxy517.findArrayByGuestbookIdAndDataverseId(Unknown Source)
    at edu.harvard.iq.dataverse.EJB31_Generated__GuestbookResponseServiceBean__Intf____Bean.findArrayByGuestbookIdAndDataverseId(Unknown Source)
    at edu.harvard.iq.dataverse.GuestbookResponsesPage.init(GuestbookResponsesPage.java:93)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.sun.el.parser.AstValue.invoke(AstValue.java:289)
    at com.sun.el.MethodExpressionImpl.invoke(MethodExpressionImpl.java:304)
    at org.jboss.weld.util.el.ForwardingMethodExpression.invoke(ForwardingMethodExpression.java:40)
    at org.jboss.weld.el.WeldMethodExpression.invoke(WeldMethodExpression.java:50)
    at com.sun.faces.facelets.el.TagMethodExpression.invoke(TagMethodExpression.java:105)
    at javax.faces.component.MethodBindingMethodExpressionAdapter.invoke(MethodBindingMethodExpressionAdapter.java:87)
    at com.sun.faces.application.ActionListenerImpl.processAction(ActionListenerImpl.java:102)
    at javax.faces.component.UIViewAction.broadcast(UIViewAction.java:562)
    at javax.faces.component.UIViewRoot.broadcastEvents(UIViewRoot.java:790)
    at javax.faces.component.UIViewRoot.processApplication(UIViewRoot.java:1282)
    at com.sun.faces.lifecycle.InvokeApplicationPhase.execute(InvokeApplicationPhase.java:81)
    at com.sun.faces.lifecycle.Phase.doPhase(Phase.java:101)
    at com.sun.faces.lifecycle.LifecycleImpl.execute(LifecycleImpl.java:198)
    at javax.faces.webapp.FacesServlet.service(FacesServlet.java:646)
    at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1682)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:344)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
    at org.glassfish.tyrus.servlet.TyrusServletFilter.doFilter(TyrusServletFilter.java:295)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
    at org.ocpsoft.rewrite.servlet.RewriteFilter.doFilter(RewriteFilter.java:226)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:316)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:160)
    at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:734)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:673)
    at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:99)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:174)
    at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:415)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:282)
    at com.sun.enterprise.v3.services.impl.ContainerMapper$HttpHandlerCallable.call(ContainerMapper.java:459)
    at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:167)
    at org.glassfish.grizzly.http.server.HttpHandler.runService(HttpHandler.java:201)
    at org.glassfish.grizzly.http.server.HttpHandler.doHandle(HttpHandler.java:175)
    at org.glassfish.grizzly.http.server.HttpServerFilter.handleRead(HttpServerFilter.java:235)
    at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119)
    at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:284)
    at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:201)
    at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:133)
    at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112)
    at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77)
    at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:561)
    at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112)
    at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:117)
    at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:56)
    at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:137)
    at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565)
    at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: ERROR: column r.downloadtype does not exist
  Position: 40
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2433)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2178)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
    at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:155)
    at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:118)
    at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.postgresql.ds.PGPooledConnection$StatementHandler.invoke(PGPooledConnection.java:428)
    at com.sun.proxy.$Proxy237.executeQuery(Unknown Source)
    at com.sun.gjc.spi.jdbc40.PreparedStatementWrapper40.executeQuery(PreparedStatementWrapper40.java:642)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeSelect(DatabaseAccessor.java:1007)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:642)
    ... 106 more

@mdmADA (Contributor, Author) commented Jun 19, 2019

Right. I was expecting a few issues and will look into these ASAP.

mdmADA added 2 commits June 20, 2019 13:05
…G_FOR_PAGE_DISPLAY sql strings to include the join between guestbookresponse and new filedownload table.
@mdmADA (Contributor, Author) commented Jun 20, 2019

Ok - I updated the hard-coded SQL queries in GuestbookResponseServiceBean.java to include the join on the new filedownload table and the modified guestbookresponse table, and to query downloadtype from the former instead of the latter.

I tested the following scenarios with a Dataverse yyyyy that has 2 datasets: 1 with a guestbook and 1 without:

  1. Navigate to the specific Dataverse yyyyy:
  • click 'Edit Dataset Guestbooks'
    => takes the user to manage-guestbooks.xhtml?dataverseId=
  • click 'Download All Responses'
    -> downloads a csv file with all records of downloads of all files in all datasets within the specific dataverse
    -> works (no exception thrown and the csv file shows the correct information)
  2. In the same page, manage-guestbooks.xhtml?dataverseId=:
  • click 'View Responses' associated with a specific Guestbook instance
    => takes the user to guestbook-responses.xhtml?dataverseId=&guestbookId=
    -> shows all the responses of all users who have entered guestbook details and downloaded files
    -> works (no exception thrown and shows the correct information)
  3. In the same page, guestbook-responses.xhtml?dataverseId=&guestbookId=:
  • click 'Download Responses'
    -> downloads a .csv file with all responses as shown in the UI in 2. above
    -> works (no exception thrown and the csv file has the correct information)

Comments/Questions:

  1. I am not sure of the efficiency of the queries themselves, particularly with respect to very large numbers of responses and rendering them in the UI.

  2. Are there any other points in the code that are affected by the splitting of gbr into gbr+filedownload?
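For readers following along, the shape of the rewritten display query is roughly the following. This is a hypothetical, heavily simplified sketch run against SQLite; the actual SQL strings in GuestbookResponseServiceBean.java are more involved and target PostgreSQL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""CREATE TABLE guestbookresponse
    (id INTEGER PRIMARY KEY, name TEXT, responsetime TEXT)""")
cur.execute("""CREATE TABLE filedownload (
    guestbookresponse_id INTEGER PRIMARY KEY REFERENCES guestbookresponse (id),
    downloadtype TEXT)""")
cur.execute("INSERT INTO guestbookresponse VALUES (1, 'Guest', '2019-06-20')")
cur.execute("INSERT INTO filedownload VALUES (1, 'Download')")

# downloadtype now lives in filedownload, so the display query must join
# the two tables instead of selecting r.downloadtype directly (which is
# exactly the column the stack trace above says no longer exists)
rows = cur.execute("""
    SELECT r.id, r.name, r.responsetime, f.downloadtype
    FROM guestbookresponse r
    JOIN filedownload f ON f.guestbookresponse_id = r.id
    ORDER BY r.id DESC
    LIMIT 5000""").fetchall()
print(rows)  # [(1, 'Guest', '2019-06-20', 'Download')]
```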

@mdmADA (Contributor, Author) commented Jun 20, 2019

Quoting @scolapasta's earlier responses to my questions:

"@mdmADA primary and foreign keys on the new table should automatically be created. Good question about foreign keys to the new table.

Some comments on your other questions:

  1. @jggautier any thoughts on this, since I believe you were involved in the reasoning to remove it?
  2. I think we will want them in both, since the idea is to eventually have times when they could be vastly different (e.g. a user fills out a guestbook to request access, waits to be given access, then downloads a week later).
  3. A guestbook will eventually have to be associated with a request, but that request should have the link to the file, so my initial thought is it's safe to move these connections to the download table.
  4. I think a download cannot exist with a guestbook, but a guestbook will be able to live without a download."

I never did respond to these points from @scolapasta:

"primary and foreign keys on the new table should automatically be created. Good question about foreign keys to the new table." - mdmADA: The keys were taken care of, and correctly, by Flyway.

"3. A guestbook will eventually have to be associated with a request, but that request should have the link to the file, so my initial thought is it's safe to move these connections to the download table."

  • mdmADA: I left datafile_id, dataset_id, etc. in the guestbookresponse table, so if you feel they really should be in the filedownload table, it would be best to move them now, I would think.

Just to clarify, the guestbookresponse table would then have the columns:
id (pk)
email
institution
name
position
responsetime
authenticateduser_id
guestbook_id

and the filedownload table would then have the columns:
downloadtimestamp
downloadtype
guestbookresponse_id => pk and also fk to guestbookresponse.id
sessionid
datafile_id
dataset_id
datasetversion_id

"4. I think a download cannot exist with a guestbook, but a guestbook will be able to live without a download." - mdmADA: do you mean a download must have a guestbookresponse (even if a default one), but a guestbookresponse might not have a download? => makes sense to me
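That reading of the rule (every download has a guestbookresponse; a guestbookresponse may have no download) can be sketched as an optional one-to-one. The class and field names below are illustrative stand-ins, not the actual entity classes in the PR:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FileDownload:
    # Child record: keyed on (and cannot exist without) its parent response
    guestbookresponse_id: int
    downloadtype: str

@dataclass
class GuestbookResponse:
    # Parent record: may exist with no download yet, e.g. a response
    # captured at request-access time before any file is downloaded
    id: int
    name: str
    file_download: Optional[FileDownload] = None

# A response without a download is valid...
r = GuestbookResponse(id=1, name="Guest")
assert r.file_download is None

# ...and a download always carries its parent's id
r.file_download = FileDownload(guestbookresponse_id=r.id, downloadtype="Download")
```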

@landreev (Contributor) commented:

2. Are there any other points in the code that are affected by the splitting of gbr into gbr+filedownload?

There are also native queries on the guestbookresponse table in GuestbookServiceBean.java and MetricsServiceBean.java; but they don't appear to be using any fields that have been moved to the new table, so we should be ok. (But please double-check that, in case I'm missing something)

@kcondon (Contributor) commented Jun 20, 2019

@mdmADA
OK, now download/view responses work, thanks. However, based on your question about other functional areas, we looked at the metrics API endpoints for the downloads type. It appears to work for the pastDays endpoint but not for total or toMonth; both return 0.
http://guides.dataverse.org/en/4.15/api/metrics.html

On further testing, this seemed to exist on the develop branch too, so not your issue.
Actually, it was my issue in interpreting how the endpoints work: all my downloads happened in the current month, and by definition toMonth and Total (with no qualifier) operate on downloads up to but not including the present month.

@landreev (Contributor) commented:

1. I am not sure of the efficiency of the queries themselves particularly wrt very large numbers of responses and rendering them in the UI.

For the purposes of showing the responses in the UI, it looks like we are adding a limit to the query (5K by default; configurable in the settings table).
For the purposes of downloading the responses as a CSV file, it will try to get everything. I believe the assumption was that the download operation did not need to be super fast... but yes, this is potentially a problem: that table can hold more entries than can be stuffed into memory.
I believe the real issue here is that we allow the table to grow indefinitely; on a busy system that serves a lot of downloads, it can eventually flood the database. We need some mechanism for trimming and/or archiving the older results: keeping the counts, but removing the detailed entries for responses older than some number of months, maybe?
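One possible shape for such a trim, sketched against a toy schema: roll detail rows older than a cutoff up into per-dataset counts, then delete them. This is a hypothetical approach, not an agreed design; the archived_download_counts table and the cutoff value are invented for the illustration (SQLite stands in for PostgreSQL).

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""CREATE TABLE guestbookresponse
    (id INTEGER PRIMARY KEY, dataset_id INTEGER, responsetime TEXT)""")
cur.executemany("INSERT INTO guestbookresponse VALUES (?, ?, ?)",
                [(1, 10, "2018-11-01"),   # old detail row
                 (2, 10, "2018-12-15"),   # old detail row
                 (3, 10, "2019-06-01")])  # recent row, kept as-is

# Hypothetical aggregate table that preserves the counts after trimming
cur.execute("""CREATE TABLE archived_download_counts
    (dataset_id INTEGER PRIMARY KEY, n INTEGER)""")

CUTOFF = "2019-01-01"  # e.g. "older than some number of months"
cur.execute("""INSERT INTO archived_download_counts (dataset_id, n)
    SELECT dataset_id, COUNT(*) FROM guestbookresponse
    WHERE responsetime < ? GROUP BY dataset_id""", (CUTOFF,))
cur.execute("DELETE FROM guestbookresponse WHERE responsetime < ?", (CUTOFF,))

remaining = cur.execute("SELECT COUNT(*) FROM guestbookresponse").fetchone()[0]
archived = cur.execute(
    "SELECT n FROM archived_download_counts WHERE dataset_id = 10").fetchone()[0]
print(remaining, archived)  # 1 2
```

In production this would need to run in one transaction so the count and the delete see the same rows.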

@kcondon kcondon merged commit 631db06 into IQSS:develop Jun 20, 2019
@kcondon kcondon assigned kcondon and unassigned mdmADA Jun 25, 2019
pdurbin added a commit that referenced this pull request Jun 27, 2019
@djbrooke djbrooke added this to the 4.15.1 milestone Jul 10, 2019
qqmyers added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this pull request Sep 22, 2023
qqmyers added a commit to QualitativeDataRepository/dataverse that referenced this pull request Sep 22, 2023
8 participants