Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

different data format for pc group serialization #3018

Merged
merged 5 commits into from
Jan 21, 2024

Conversation

ThomasBreuer
Copy link
Member

@ThomasBreuer ThomasBreuer commented Nov 14, 2023

Use arrays of tuples (i, word) resp. (j, i, word) instead of two parallel arrays
This looks more natural (would apply also for other objects such as sparse matrices).

@antonydellavecchia
Copy link
Collaborator

antonydellavecchia commented Nov 14, 2023

can you post a comparison of the files? There might be a way to use save_object and load_object if there is a way to know the type from some point in tree closer to the root? This is essentially what save_type_params / load_type_params tries to do. For example if you know the type you are serializing will always use some subtype to serialize, you dont need to document that in the file and you can just use save_object which will store the data that needs to be passed to that type. Then on load since you know the structure of the subtype you can just force the data to have that subtype.

some sort of minmal example would be this

julia> data = [(1, 2), (3, 4)]
julia> Oscar.save_object(s, data)
[["1","2"],["3","4"]]

where you store data at some key :power_relators

then when loading at :power_relators you know that the data is supposed to fit into some specific type
in this example tuple{Int, Int} but I didn't need to add that to the file.

I am not 100% sure this would work in your case? but maybe?

save_object(s, reli, :power_reli)
save_object(s, rels, :power_rels)
if length(rels) != 0
save_typed_object(s, rels, :power_rels)
Copy link
Collaborator

@antonydellavecchia antonydellavecchia Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
save_typed_object(s, rels, :power_rels)
save_data_array(s, :power_rels) do
save_object(s, rels)
end

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
save_typed_object(s, rels, :power_rels)
save_object(s, rels, :power_rels)

this should actually work better

GAP.Globals.SetPower(rws, reli[k],
GapObj(GAPWrap.ObjByExtRep(fam, GapObj(rels[k]))))
if haskey(d, :power_rels)
rels = load_typed_object(s, d[:power_rels])
Copy link
Collaborator

@antonydellavecchia antonydellavecchia Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
rels = load_typed_object(s, d[:power_rels])
rels = Tuple{Int, Vector{Int}[]
for data_rel in d[:power_rels]
loaded_rel = load_object(s, Tuple{Int, Vector{Int}}, data_rel)
push!(rels, loaded_rel)
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see:
If the data file does not specify the type of the array entries then I can run over the entries and set the type for each of them, instead of trying to specify the type of the whole array.
I will try this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait: I get the following.

julia> rel = Any["2", Any["3", "1"]];

julia> @which Oscar.load_object(s, Tuple{Int, Vector{Int}}, rel)
ERROR: no unique matching method found for the specified argument types

The code in Serialization/containers.jl suggests that I should try

julia> Oscar.load_object(s, Tuple{Int, Vector{Int}}, rel, [Int, [Vector{Int}, Int]])
(2, [3, 1])

Looks complicated but seems to work. I will update the pull request.

Copy link
Collaborator

@antonydellavecchia antonydellavecchia Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ya ok. Ya this is due to the fact that arbitrary tuples can get complicated. I will hopefully get round to simplifying this at some point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message for me is:

  • Avoid nested vectors.
  • Vectors of tuples are fine (although they look like nested vectors in the data files), but deserialize each tuple separately. This way, it is not necessary to store type information about these objects in the data files.

save_object(s, reli, :comm_reli)
save_object(s, rels, :comm_rels)
if length(rels) != 0
save_typed_object(s, rels, :comm_rels)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be dealt with similar to above situation

GAP.Globals.SetCommutator(rws, j, i,
GapObj(GAPWrap.ObjByExtRep(fam, GapObj(rels[k]))))
if haskey(d, :comm_rels)
rels = load_typed_object(s, d[:comm_rels])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to above I believe?

@ThomasBreuer ThomasBreuer marked this pull request as ready for review November 14, 2023 16:22
@ThomasBreuer ThomasBreuer changed the title different data format for pc group serialization (do not merge) different data format for pc group serialization Nov 14, 2023
GAP.Globals.SetPower(rws, reli[k],
GapObj(GAPWrap.ObjByExtRep(fam, GapObj(rels[k]))))
for data_rel in d[:power_rels]
(i, elm) = load_object(s, Tuple{Int, Vector{Int}}, data_rel,
Copy link
Collaborator

@antonydellavecchia antonydellavecchia Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this work?

Suggested change
(i, elm) = load_object(s, Tuple{Int, Vector{Int}}, data_rel,
(i, elm) = load_object(s, Tuple, data_rel, [Int, Vector{Int}])

in any case you can certainly just replace Tuple{Int, Vector{Int}} with Tuple which should help a bit

for k in 1:length(reli)
(j, i) = reli[k]
for data_rel in d[:comm_rels]
(j, i, elm) = load_object(s, Tuple{Int, Int, Vector{Int}}, data_rel,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly to above

Suggested change
(j, i, elm) = load_object(s, Tuple{Int, Int, Vector{Int}}, data_rel,
(j, i, elm) = load_object(s, Tuple, data_rel,

(j, i) = reli[k]
for data_rel in d[:comm_rels]
(j, i, elm) = load_object(s, Tuple{Int, Int, Vector{Int}}, data_rel,
[Int, Int, [Vector{Int}, Int]])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? this needs to be checked

Suggested change
[Int, Int, [Vector{Int}, Int]])
[Int, Int, Vector{Int}])

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or?

Suggested change
[Int, Int, [Vector{Int}, Int]])
[Int, Int, [Vector, Int]])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the latter works. Thanks.

Concerning the proposal to specify just Tuple as the second argument of load_object: Yes, this also works.
On the other hand, the second argument is logically the right place for specifying the desired type of the result (if it would be possible to omit the fourth argument).

I am going to update the pull request.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I also agree, I think the issue is there isn't currently a function that can break down the second argument into all its params in the right way so that I can load an arbitrary tuple. I will make an attempt at dealing with this in a later pull request.

Copy link
Collaborator

@antonydellavecchia antonydellavecchia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor details on simplify a bit, otherwise looks good

- use arrays of tuples `(i, word)` resp. `(j, i, word)`
  instead of two parallel arrays
- as a consequence, use `save_typed_object`/`load_typed_object`
  instead of `save_object`/`load_object`
The syntax `load_object(s, T, data, params)` is no longer supported.
(The syntax used now can be regarded as simpler,
once one has found out what one actually has to write down.)
Copy link

codecov bot commented Jan 13, 2024

Codecov Report

Merging #3018 (5df2ada) into master (76777a9) will increase coverage by 0.00%.
Report is 4 commits behind head on master.
The diff coverage is 94.11%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #3018   +/-   ##
=======================================
  Coverage   81.14%   81.14%           
=======================================
  Files         556      556           
  Lines       73894    73900    +6     
=======================================
+ Hits        59962    59968    +6     
  Misses      13932    13932           
Files Coverage Δ
src/Serialization/GAP.jl 94.83% <94.11%> (+0.22%) ⬆️

... and 8 files with indirect coverage changes

@ThomasBreuer
Copy link
Member Author

@antonydellavecchia I have adjusted the proposed changes to the changed syntax, the code became simpler now.

Copy link
Collaborator

@antonydellavecchia antonydellavecchia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@ThomasBreuer ThomasBreuer merged commit 68a19c2 into oscar-system:master Jan 21, 2024
24 of 25 checks passed
@ThomasBreuer ThomasBreuer deleted the TB_serialize_pc_groups branch January 21, 2024 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants