Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sizeof(variant) #19

Open
springmeyer opened this issue Jun 6, 2014 · 11 comments
Open

sizeof(variant) #19

springmeyer opened this issue Jun 6, 2014 · 11 comments
Milestone

Comments

@springmeyer
Copy link
Contributor

My understanding is that calling sizeof on a variant should produce the size of the largest type the variant includes. And my assumption is that sizeof for a given type is not going to be consistent/portable across arches and platforms. However, as we should strive to have the variant use at little memory as possible I wanted to surface this ticket for discussion.

Dumb questions that I assume the answer is duh, no, but want to know for sure:

  • Would it make sense to try to add tests of sizeof(variant_instance) by adapting to how sizeof(std::string) and other types might be different per platform? Might catch regressions if we ever made a mistake that increased the variant memory footprint.
  • Are there any optimizations to be had from learning/applying ideas from http://www.catb.org/esr/structure-packing/?
  • For custom types should we add some kind of MAX_VARIANT_SIZEOF flag to allow clamping the size - this would then be able to catch a situation were a poorly aligned custom type is larger than other built-in types used in the variant
@joto
Copy link
Contributor

joto commented Jun 26, 2014

Currently class variant has data members std::size_t type_index and data_type data. Here sizeof(data_type) will be the size of the largest type the variant includes. But together they will be at least sizeof(size_t) == 8 larger.

Plus there is the alignment issue:
Say we have a variant<bool, int32_t>, it will be 16 bytes long, 8 for the size_t and 8 for the data_type, although max(sizeof(bool), sizeof(int32_t)) is only 4. So for types smaller than 8 a smaller type_index would buy us something.

@lightmare
Copy link
Contributor

This may sound like heresy, as sizeof...(Ts) returns a size_t, but if instance size is a concern, having size_t type_index seems wasteful. For most practical uses, uint8_t would be enough; and for extreme uses, uint16_t is likely to suffice for a while. I just tried constructing a variant of 1000 integral_constants and got error: template instantiation depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum).

I changed my generator template to produce 4 types per recursion and lowered the depth to 500, which postponed the error to the definition of data_size = static_max<sizeof(Types)...> (attempting to find the largest of 2000 different integral_constants). So I added -ftemplate-depth=5000 and tried a variant of 4000 types -- it took ~2.5g or memory and 40+ seconds to compile that nonsense ;)

The smallest type suitable for holding type_index can be selected like this:

template <typename K, typename... Constants>
struct integral_upper_bound
{
    using type = void;
};

template <typename K, typename C, typename... Constants>
struct integral_upper_bound<K, C, Constants...>
{
    using type = typename std::conditional<
                              K::value < C::value,
                              C,
                              typename integral_upper_bound<K, Constants...>::type
                          >::type;
};

template <typename... Types>
struct variant_traits
{
    using invalid_index = typename integral_upper_bound<
                                       std::integral_constant<std::size_t, sizeof...(Types)>,
                                       std::integral_constant<std::uint8_t, UINT8_MAX>,
                                       std::integral_constant<std::uint16_t, UINT16_MAX>,
                                       std::integral_constant<std::uint32_t, UINT32_MAX>,
                                       std::integral_constant<std::size_t, SIZE_MAX>
                                   >::type;

    using index_type = typename invalid_index::value_type;
    static constexpr index_type invalid_value = invalid_index::value;
};

But then again, if you put that in a variant that contains something larger than int, it's not going to make any difference.

@lightmare
Copy link
Contributor

I've found similar index-type shrinking in anthonyw's implementation, and a coarser one is optional in boost (BOOST_VARIANT_MINIMIZE_SIZE)

@joto
Copy link
Contributor

joto commented Jan 27, 2016

I we are only thinking about the size of the index, I think we should just go with int8_t. I can't think of a use case where more than 255 types are needed in a variant except maybe very crazy autogenerated code. But even then our implementation would be very costly in compile time and run time. So I think we can safely ignore that case.

Using an int8_t type_index instead of size_t could lead to considerable space savings for types like std::vector<std::variant<char, uint16_t>>, so I think this is worth pursuing.

What I don't know is whether this could impact runtime negatively. Modern processors like their 32bit or 64bit integers, so there might be a penalty when using an int_8?

@lightmare
Copy link
Contributor

I think uint8_t is fine. At least on x86-64, an operand size prefix is only needed for int16 and int64 (=> longer instructions => longer code => caches not happy), that might be why the BOOST_VARIANT_MINIMIZE_SIZE only considers char and int for the type.

@springmeyer
Copy link
Contributor Author

Using an int8_t type_index instead of size_t could lead to considerable space savings for types like std::vector<std::variant<char, uint16_t>>, so I think this is worth pursuing.

:+1 . My gut says space savings will be meaningful for our usecases while I really really doubt any negative runtime perf hit.

@springmeyer springmeyer added this to the 1.1.0 milestone Jan 28, 2016
@joto
Copy link
Contributor

joto commented Jan 28, 2016

This should be an easy change once we have a more complete set of tests.

@joto joto changed the title sizeof(varint) sizeof(variant) Jan 29, 2016
@artemp
Copy link
Contributor

artemp commented Feb 10, 2016

#include <iostream>
#include "variant.hpp"
#include <boost/variant.hpp>

struct big
{
    double val[10]; // 8 * 10 = 80
};

i
int main()
{
    {
        std::cerr << sizeof(mapbox::util::variant<char>) << std::endl;
        std::cerr << sizeof(mapbox::util::variant<char,int>) << std::endl;
        std::cerr << sizeof(mapbox::util::variant<char,int,std::string>) << std::endl;
        std::cerr << sizeof(mapbox::util::variant<char,int,std::string, big>) << std::endl;
    }
    {
        std::cerr << sizeof(boost::variant<char>) << std::endl;
        std::cerr << sizeof(boost::variant<char,int>) << std::endl;
        std::cerr << sizeof(boost::variant<char,int,std::string>) << std::endl;
        std::cerr << sizeof(boost::variant<char,int,std::string,big>) << std::endl;
    }
    return 0;
}
./test
16
16
32
88
8
8
32
88 
88

From output above it looks like we could do a better job optimising for size when max size stored is sizeof(T) < 16. But this is not a typical use case.
Moving to the next milestone

@artemp artemp modified the milestones: 2.0.0, 1.1.0 Feb 10, 2016
@artemp
Copy link
Contributor

artemp commented Jan 6, 2017

planning to take a look at using uint8_t for type_index
/cc @springmeyer @joto @lightmare

@daniel-j-h
Copy link
Contributor

For the record OSRM just hit this and moved away from using a mapbox/variant internally saving > 11 GB.
Project-OSRM/osrm-backend#3646

artemp added a commit that referenced this issue Feb 3, 2017
…f + use `unsigned int` by default. This addresses `sizeof` discrepancies between boost/std/mapbox variants (ref #19)
artemp added a commit that referenced this issue Feb 3, 2017
@artemp
Copy link
Contributor

artemp commented Feb 3, 2017

std::size_t for storing internal index was too generous and also was resulting in larger sizeof in some use cases comparing to boost::variant and std::variant. After 9eec1fd

#include <iostream>
#include <mapbox/variant.hpp>
#include <boost/variant.hpp>
#include <variant>

struct big
{
    double val[10]; // 8 * 10 = 80
};


int main()
{
    {
        std::cerr << "mapbox::variant" << std::endl;
        std::cerr << sizeof(mapbox::util::variant<char>) << std::endl;
        std::cerr << sizeof(mapbox::util::variant<char,int>) << std::endl;
        std::cerr << sizeof(mapbox::util::variant<char,int,std::string>) << std::endl;
        std::cerr << sizeof(mapbox::util::variant<char,int,std::string, big>) << std::endl;
    }
    {
        std::cerr << "boost::variant" << std::endl;
        std::cerr << sizeof(boost::variant<char>) << std::endl;
        std::cerr << sizeof(boost::variant<char,int>) << std::endl;
        std::cerr << sizeof(boost::variant<char,int,std::string>) << std::endl;
        std::cerr << sizeof(boost::variant<char,int,std::string,big>) << std::endl;
    }
    {
        std::cerr << "std::variant" << std::endl;
        std::cerr << sizeof(std::variant<char>) << std::endl;
        std::cerr << sizeof(std::variant<char,int>) << std::endl;
        std::cerr << sizeof(std::variant<char,int,std::string>) << std::endl;
        std::cerr << sizeof(std::variant<char,int,std::string,big>) << std::endl;
    }

    return 0;
}
./bin/clang-darwin-4.2.1/debug/variant-size-test 
mapbox::variant
8
8
32
88
boost::variant
8
8
32
88
std::variant
8
8
32
88

/cc @daniel-j-h @springmeyer @joto

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants