
Thread: std::vector differences in g++ and VC++

  1. #1
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts

    std::vector differences in g++ and VC++

    While debugging zpaq I found that g++ and VC++ treat std::vector differently when the element type has a constructor, as in a vector<string>. The following program creates a vector of 3 elements, resizes it to 5, and then to 1. Each element holds a std::string and a pointer to allocated memory. I wrote a copy constructor, assignment operator, and destructor that behave identically to the compiler-supplied defaults (you get the same behavior if you remove them). The default is to copy, assign, or destroy each member of the object. (This is incorrect for pointers that own memory, but I did it this way to see what was happening.)

    The declaration vector<T> v(n) creates n elements of type T. g++ constructs a single temporary object, then makes n bitwise copies, and destroys the temporary. This is obviously incorrect if T allocates memory because all of the objects point to the same memory. All of the strings point to the same memory, but it works because g++ uses copy-on-write. VC++ creates a temporary object once for each element, copies it to the vector, and destroys the temporary. This is correct, but inefficient. When a string is copied in VC++, the value is copied to new memory.

    When a vector is enlarged using resize(), it must allocate new elements and copy over the old ones to the new array. Both do this correctly using the copy constructor. New elements are initialized in the same way as when a vector is created.

    When a vector is resized to a smaller size, neither g++ nor VC++ free any memory. Both correctly call the destructor on the removed elements. g++ inexplicably creates a temporary object (as if to initialize new elements) and then destroys it without using it.

    In another experiment (not shown), I found that with a series of push_back() calls, g++ doubles the reserved space whenever it needs to enlarge the vector. VC++ allocates in smaller chunks, which is more memory efficient but becomes slow when the vector gets large.

    Code:
    #include <stdio.h>
    #include <string>
    #include <vector>
    using namespace std;
    
    struct T {
      string s;
      char* p;
      T(): s(1<<24, 0), p(new char[1<<24]) {printf("constructor\n");}
      ~T() {printf("destructor\n");}
      T(const T& x): s(x.s), p(x.p) {printf("copy constructor\n");}
      T& operator=(const T& x) {s=x.s, p=x.p; printf("operator=\n"); return *this;}
    };
    
    int main() {
      printf("\n string copy constructor behavior\n");
      string s="hello";
      string t=s;
      printf("         s=%s %p t=%s %p\n", s.c_str(), s.c_str(), t.c_str(), t.c_str());
      t.resize(4);
      printf("resized: s=%s %p t=%s %p\n\n", s.c_str(), s.c_str(), t.c_str(), t.c_str());
    
      vector<T> v(3);
      printf(" Create with size 3 at %p\n", &v[0]);
      for (unsigned i=0; i<v.size(); ++i)
        printf("%u %u %p %p\n", i, (unsigned)v[i].s.size(), v[i].s.c_str(), v[i].p);
    
      v.resize(5);
      printf("\n resize 3 to 5 at %p\n", &v[0]);
      for (unsigned i=0; i<v.size(); ++i)
        printf("%u %u %p %p\n", i, (unsigned)v[i].s.size(), v[i].s.c_str(), v[i].p);
    
      printf("\n resize 5 to 1 at %p\n", &v[0]);
      v.resize(1);
      for (unsigned i=0; i<v.size(); ++i)
        printf("%u %u %p %p\n", i, (unsigned)v[i].s.size(), v[i].s.c_str(), v[i].p);
    
      return 0;
    }
    Output:

    Code:
    C:\res\jidac>g++ -O3 a.cpp -o a.exe
    
    C:\res\jidac>a
    
     string copy constructor behavior
             s=hello 002E3474 t=hello 002E3474
    resized: s=hello 002E3474 t=hell 002E3494
    
    constructor
    copy constructor
    copy constructor
    copy constructor
    destructor
     Create with size 3 at 002E34A8
    0 16777216 0098002C 01990020
    1 16777216 0098002C 01990020
    2 16777216 0098002C 01990020
    constructor
    copy constructor
    copy constructor
    copy constructor
    copy constructor
    copy constructor
    destructor
    destructor
    destructor
    destructor
    
     resize 3 to 5 at 002E34C8
    0 16777216 0098002C 01990020
    1 16777216 0098002C 01990020
    2 16777216 0098002C 01990020
    3 16777216 029A002C 039B0020
    4 16777216 029A002C 039B0020
    
     resize 5 to 1 at 002E34C8
    constructor
    destructor
    destructor
    destructor
    destructor
    destructor
    0 16777216 0098002C 01990020
    destructor
    
    C:\res\jidac>cl /O2 /EHsc a.cpp
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    a.cpp
    Microsoft (R) Incremental Linker Version 10.00.30319.01
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    /out:a.exe
    a.obj
    
    C:\res\jidac>a
    
     string copy constructor behavior
             s=hello 002CFEE0 t=hello 002CFEC4
    resized: s=hello 002CFEE0 t=hell 002CFEC4
    
    constructor
    copy constructor
    destructor
    constructor
    copy constructor
    destructor
    constructor
    copy constructor
    destructor
     Create with size 3 at 006D19F8
    0 16777216 02900020 018F0020
    1 16777216 04920020 03910020
    2 16777216 06940020 05930020
    copy constructor
    copy constructor
    copy constructor
    destructor
    destructor
    destructor
    constructor
    copy constructor
    destructor
    constructor
    copy constructor
    destructor
    
     resize 3 to 5 at 006D1A60
    0 16777216 008E0020 018F0020
    1 16777216 07950020 03910020
    2 16777216 08960020 05930020
    3 16777216 06940020 04920020
    4 16777216 0A980020 09970020
    
     resize 5 to 1 at 006D1A60
    destructor
    destructor
    destructor
    destructor
    0 16777216 008E0020 018F0020
    destructor

  2. #2
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Matt Mahoney View Post
    g++ constructs a single temporary object, then makes n bitwise copies, and destroys the temporary.
    What I see in your output (and mine) is:
    Code:
    constructor
    copy constructor
    copy constructor
    copy constructor
    destructor
    The copy constructor is called and does what you ask it to do.

    Quote Originally Posted by Matt Mahoney View Post
    This is obviously incorrect if T allocates memory because all of the objects point to the same memory. All of the strings point to the same memory, but it works because g++ uses copy-on-write. VC++ creates a temporary object once for each element, copies it to the vector, and destroys the temporary. This is correct, but inefficient. When a string is copied in VC++, the value is copied to new memory.

    When a vector is enlarged using resize(), it must allocate new elements and copy over the old ones to the new array. Both do this correctly using the copy constructor. New elements are initialized in the same way as when a vector is created.

    When a vector is resized to a smaller size, neither g++ nor VC++ free any memory. Both correctly call the destructor on the removed elements. g++ inexplicably creates a temporary object (as if to initialize new elements) and then destroys it without using it.
    Looks like a bug. The standard is clear that the function is
    Code:
    void resize( size_type count, T value = T() );
    so it has to call the constructor regardless of whether it increases or decreases the size. There might be some rule overriding this, but it's likely that MS traded correctness for speed.

  3. #3
    Expert
    Matt Mahoney's Avatar
    Not sure which is correct according to the reference at http://www.cplusplus.com/reference/stl/vector/vector/
    For the following program, g++ prints "1" and VC++ prints "1 2 3 4 5".

    Code:
    #include <stdio.h>
    #include <vector>
    struct T {
      static int count;
      T() {printf("%d ", ++count);}
    };
    int T::count=0;
    int main() {
      std::vector<T> v(5);
    }
    But "new T[5];" prints "1 2 3 4 5" for both compilers.

  4. #4
    Member m^2's Avatar
    The standard ISO/IEC 14882:1998, 23.1.1 states "construct a sequence with n copies of T". I'd say gcc is right again.

  5. #5
    Expert
    Matt Mahoney's Avatar
    Depends what the standard means by "n copies of T". You can define what copy means, and you can define what it means for two objects to be equal. If two strings point to different arrays with the same contents, are they equal? g++ and VC++ seem to disagree.

  6. #6
    Member Karhunen's Avatar
    Join Date
    Dec 2011
    Location
    USA
    Posts
    91
    Thanks
    2
    Thanked 1 Time in 1 Post
    If I am reading it right, to be a "copy" means to be an equivalent sequence (ordered list), but there is no requirement that *this A1 equals *this A2. I assume this since the next section describes associative containers, and its last statement is that for the comparison object comp, comp(A1,A2) and comp(A2,A1) are both false, i.e. that the keys of A1 and A2 are equal in sequence.
    Attached image: 20120720113418.gif

  7. #7
    Member m^2's Avatar
    The insert image window doesn't work in my browser, so I attach an image. The std seems clear that copying means using a copy constructor or copy assignment operator.
    Attached image: 2012-07-20_190536.png

  8. #8
    Expert
    Matt Mahoney's Avatar
    Then I guess that g++ and VC++ are both correct.

  9. #9
    Member m^2's Avatar
    Missed your reply until now... I think VC++ isn't correct because it uses neither the copy constructor nor the assignment operator.
    And, BTW, with the function signature given by the standard it's impossible to do what MS does using C++. Or am I missing something?
    Last edited by m^2; 24th July 2012 at 19:28.

  10. #10
    Expert
    Matt Mahoney's Avatar
    It seems like all of these would be legal ways to initialize the elements of a vector:

    - Call the default constructor n times, once for each element.
    - Call the default constructor for one object and call the copy constructor n-1 times.
    - Call the default constructor once for a temporary object, copy constructor n times, and destroy the temporary (g++).
    - Call the default constructor, copy constructor, and destructor n times each (VC++).

    The first method seems simplest. Don't know why neither compiler does it this way.

  11. #11
    Member m^2's Avatar
    Wait... your previous post (http://encode.ru/threads/1578-std-ve...ll=1#post29955) suggested that VC++ does the 1st thing. Does it really create N temporary objects?
    As to the 1st thing, it's impossible in C++ because the vector constructor gets an object and the number of similar ones that are supposed to be constructed. It can't tell whether the object was created with a default or any other constructor. For this reason, the 4th thing is impossible to write in C++ too.

  12. #12
    Expert
    Matt Mahoney's Avatar
    Creating a vector of 3 elements in g++ does this:

    constructor
    copy constructor
    copy constructor
    copy constructor
    destructor

    In VC++ it does this:

    constructor
    copy constructor
    destructor
    constructor
    copy constructor
    destructor
    constructor
    copy constructor
    destructor

    Both are different from new[], which does the sensible thing of just calling the default constructor 3 times. But new[] does not initialize built-in types like int, so a vector<int> is all 0 but new int[] is uninitialized.

  13. #13
    Member Karhunen's Avatar
    Unless you explicitly call the constructor by reference, for instance V(const V& tocopy), but then you get ugly code. I found a similar discussion on Stack Overflow.
    Last edited by Karhunen; 24th July 2012 at 23:56. Reason: instance has 2 n
