Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better ProcessedVariable #760

Closed
valentinsulzer opened this issue Dec 12, 2019 · 17 comments · Fixed by #781
Closed

Better ProcessedVariable #760

valentinsulzer opened this issue Dec 12, 2019 · 17 comments · Fixed by #781
Assignees

Comments

@valentinsulzer
Copy link
Member

@TomTranter has pointed out several times that processed variables need a rethink and I agree. In particular we need to strike the balance between having to pass lots of inputs to ProcessedVariable and storing too much inside the variable object itself.
However, we could switch to something better by doing e.g.

voltage = model.variables["Terminal voltage [V]"].process(solution)

where the mesh is stored in the variable and .process created a ProcessedVariable.
Or something else? Like

processed_variables = model.process(["Terminal voltage [V]"], solution)
@valentinsulzer
Copy link
Member Author

Actually, solution could just contain the processed variables itself. So

solution = solver.solve(model, t_eval)
voltage = solution["Terminal voltage [V]"].entries # or .data? for the matrix
voltage_at_t1 = solution["Terminal voltage [V]"](t=1)

@valentinsulzer
Copy link
Member Author

This is going to break a lot of existing functionality so would be good to brainstorm ideas before getting started

@rtimms
Copy link
Contributor

rtimms commented Dec 12, 2019

it would also be good to be able to access variables in a more user-friendly way than just having a huge list of variables. maybe they could be organised by submodel and domain, e.g. something like
model.variables["electrolyte"]["negative"]["concentration"] or similar. That way users can quickly look at variables that are associated with one particular cell component. This might require some thought about how best to group things.

@brosaplanella
Copy link
Member

Following @rtimms point, if we go this way it would be good to split them between dimensional and dimensionless given that most of them have both versions.

@rtimms
Copy link
Contributor

rtimms commented Dec 12, 2019

I like the idea of solution having the processed variables in though

@valentinsulzer
Copy link
Member Author

Ok, those two things should definitely be possible but sound like separate issues

@brosaplanella
Copy link
Member

brosaplanella commented Dec 12, 2019

I see. So you mean doing what you suggest on the first post as opposed to the following?

voltage = pybamm.ProcessedVariable(
    model.variables['Terminal voltage [V]'], solution.t, solution.y, mesh=mesh
)

@valentinsulzer
Copy link
Member Author

Yeah exactly

@brosaplanella
Copy link
Member

Not sure what would be the best option as I am not familiar with what PyBaMM is doing when processing a variable. The second approach you suggested, the one with the solution containing the processed variables might be too slow and memory confusing if the solution has a lot of data points.

Given that, if I remember correctly, the mesh is needed for plotting as well, wouldn't it be better to store the mesh in the solution so then we can just do voltage = pybamm.ProcessedVariable(model.variables['Terminal voltage [V]'], solution) and use pybamm.QuickPlot as pybamm.QuickPlot(model, solution)?

@rtimms
Copy link
Contributor

rtimms commented Dec 12, 2019

alternatively, this could all be handled via the simulation object? if everything lives under the umbrella of that then we could do simulation.process_variables() to process all variables simulation.process_variables(variables=[list of vars]) to just process some

@valentinsulzer
Copy link
Member Author

Yes good point, what would be nice though is it we could just do QuickPlot(solution), though as you said we have to be careful to make sure solving isn't too slow and memory intensive - could just pass in a reduced list of variables that we will want to plot at the solving stage

@valentinsulzer
Copy link
Member Author

Yeah simulation makes it easier but would be good if it was also less clunky outside of that

@brosaplanella
Copy link
Member

I agree with @tinosulzer. Could we make the solution class contain everything that is need for postprocessing of the solution (plotting and so on?) but without including all that is included in the simulation class. If so, and if the solution class can be pickled, it means that you can store your output and then you have anything you need for plotting and further analysis.

On the other hand, if this new definition of solution brings it too close of simulation, we might need to consider if we need both.

@valentinsulzer
Copy link
Member Author

Yes that sounds like the best way forward. We could also have the default being "don't process any variables in solution" and then have a solution.update(variables) (or .process or .set) which processes any variables using the existing t and y. I think simulation is still useful and separate for everything leading up to obtaining the solution

@TomTranter
Copy link
Contributor

I was chatting with Tino and also suggested putting process variables in the simulation. I also like the idea of having dictionaries of dictionaries for breaking up the variables like a tree structure. My main issue at first was that I wasn't sure why I couldn't just see the data nicely in the solution.

Can you please explain the main reason why processed variables exist. If it's to do interpolation I feel like that should just be a separate post process or some secondary option where if you pass in a time step it does it and if not it just used the solution time steps.

If it's to help with some of the weird and whacky stuff going on with having an SPM that is only really solved at one point but shown as if it were applied along x then it's more tricky... I still don't fully understand what the different models are doing in this respect. I think the most important thing to get right is passing the data back in a consistent way when you have the N+1D models.

I guess you could have up to 5 dimensions of data?
3 spatial, r and t
In which case the solution would have these dimensions and so would the processed_variables? In which case if for a given simulation these things are consistent it makes sense to tie them together more closely.

Don't know if that helps or makes sense. It also has to be consistent with inputs and depending on where they are going to be stored it would be nice to get that as user friendly as possible

@TomTranter
Copy link
Contributor

I also feel like it's handy to basically have solution as close to an array type data as possible without loading it up with methods and would prefer the simulation class to have that functionality. It's good that you can just append time steps for example and it seems to work. From a saving and loading perspective it's better to save raw data as class definitions change (all the time) then pickled files get unreadable fast

@valentinsulzer
Copy link
Member Author

valentinsulzer commented Dec 12, 2019

Processed variable does 2 things:

  1. Transform column vectors into n-dimensional arrays (up to 5 as you pointed out). Within the model everything is stored as a column vector (e.g. all the particle concentrations are stacked on top of each other), for speed, so it all has to be reshaped at the end in order to get the n-dimensional variables
  2. Set up interpolation, which helps to abstract away having to worry about indices, which of the 5 dimensions corresponds to which, etc

Would the following be ok?

solution["Voltage"] # this is the ProcessedVariable class, with interpolation
solution["Voltage"].data # gives the n-dimensional matrix you want
solution.data # gives dictionary of n-dimensional arrays
solution.save(filename) # saves only the data arrays, to a dictionary

valentinsulzer added a commit that referenced this issue Dec 26, 2019
valentinsulzer added a commit that referenced this issue Jan 6, 2020
valentinsulzer added a commit that referenced this issue Jan 9, 2020
valentinsulzer added a commit that referenced this issue Jan 9, 2020
valentinsulzer added a commit that referenced this issue Jan 9, 2020
valentinsulzer added a commit that referenced this issue Jan 9, 2020
valentinsulzer added a commit that referenced this issue Jan 9, 2020
valentinsulzer added a commit that referenced this issue Jan 10, 2020
valentinsulzer added a commit that referenced this issue Jan 10, 2020
valentinsulzer added a commit that referenced this issue Jan 10, 2020
valentinsulzer added a commit that referenced this issue Jan 12, 2020
valentinsulzer added a commit that referenced this issue Jan 13, 2020
valentinsulzer added a commit that referenced this issue Jan 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants