DSpace collection export maximum size

Sai Deng

Dear list,

We're trying to export DSpace collections from the front end (Admin Tools - Export Collection, or Export (migrate) Collection), but it reports an error for certain collections:

Error with export

The overall size of this export is too large. Please contact your administrator for more information.

The hosting company said this was because DSpace was running out of space and made adjustments on the server side (I assume), but the export still reports the same error. We then searched DSpace-Tech and changed the itemexport maximum size setting to "org.dspace.app.itemexport.max.size = 1024", but it still reports the same error for certain collections.
 
What else can be done to resolve this issue?
 
Our actual goal is to export collections from DSpace to CONTENTdm and eventually to a state-wide sharing system. Currently, we get the metadata for a whole collection from the front end (Admin Tools - Export metadata) as a CSV file, and we get the actual bitstreams from the collection export. Adding the bitstream names to the CSV file (in a spreadsheet) involves some manual work. There are probably better ways. If you have any insights, please reply!
Thanks very much!
 

Sophie

Metadata Librarian

UCF Library

 


_______________________________________________
DSpace-tech mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

Re: DSpace collection export maximum size

helix84
Hi Sophie,

the limit you have set is 1 GB (the value is given in megabytes), so you
probably have some collections that are larger than that.

Anyway, the limit is there so that you don't attempt to transfer such
large files over the web UI. The command-line tools are not restricted
by these limitations: [1]

So you can use them to export bitstreams and continue to use the CSV
export for metadata if you wish. There's also a command-line version
of that: [2]
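
The manual step of adding bitstream names to the metadata CSV could probably be scripted. Below is a minimal Python sketch, not a tested DSpace tool. It assumes the bitstreams were exported in Simple Archive Format, that each item directory contains a `handle` file and a `contents` file (one bitstream name per line, optionally followed by tab-separated options), and that the metadata CSV has a `dc.identifier.uri` column whose value ends with the item's handle. The function names and the `bitstreams` column name are made up for illustration.

```python
import csv
from pathlib import Path


def bitstreams_by_handle(export_dir):
    """Map each item's handle to the file names listed in its `contents` file."""
    mapping = {}
    for item_dir in sorted(Path(export_dir).iterdir()):
        handle_file = item_dir / "handle"
        contents_file = item_dir / "contents"
        if not (handle_file.is_file() and contents_file.is_file()):
            continue  # not an item directory, or exported without a handle file
        handle = handle_file.read_text(encoding="utf-8").strip()
        names = []
        for line in contents_file.read_text(encoding="utf-8").splitlines():
            # A line may carry tab-separated options after the file name,
            # e.g. "thesis.pdf\tbundle:ORIGINAL"; keep only the name.
            name = line.split("\t", 1)[0].strip()
            if name:
                names.append(name)
        mapping[handle] = names
    return mapping


def add_bitstream_column(csv_in, csv_out, mapping,
                         uri_column="dc.identifier.uri",  # assumed column name
                         new_column="bitstreams"):        # hypothetical column name
    """Copy the metadata CSV, appending a column of bitstream names per row."""
    with open(csv_in, newline="", encoding="utf-8") as src, \
         open(csv_out, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames + [new_column])
        writer.writeheader()
        for row in reader:
            # The URI cell may hold several values separated by "||".
            uris = [u.strip() for u in (row.get(uri_column) or "").split("||")]
            names = next(
                (n for h, n in mapping.items()
                 if h and any(u.endswith("/" + h) for u in uris)),
                [],
            )
            row[new_column] = "||".join(names)
            writer.writerow(row)
```

The names are joined with "||", the same separator DSpace's CSV export uses for multi-valued fields. Rows with no matching export directory get an empty cell, so they are easy to spot in the spreadsheet.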

But your goal could also be achieved using OAI-PMH [3] [4], which got
a huge performance boost in DSpace 3.0. It can be used to prepare the
metadata in an arbitrary format [5] which can also contain links to
the bitstreams accessible via HTTP (resource policies still apply).
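
For the OAI-PMH route, a harvest loop might look roughly like the Python sketch below. It uses only the standard library and the generic OAI-PMH 2.0 `ListRecords` verb with the `oai_dc` format. The base URL and set name you pass in depend on your installation and are not taken from this thread. The function names are illustrative, and error responses are not handled.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, token=None):
    """Build a ListRecords request URL."""
    if token:
        # resumptionToken is an exclusive argument: no other parameters allowed.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_page(xml_bytes):
    """Return (records, next_token) for one ListRecords response."""
    root = ET.fromstring(xml_bytes)
    records = []
    for rec in root.iter(OAI + "record"):
        header = rec.find(OAI + "header")
        if header is None or header.get("status") == "deleted":
            continue  # skip malformed and deleted records
        fields = {}
        for el in rec.iter():
            # Collect every Dublin Core element, e.g. dc:title -> "title".
            if el.tag.startswith(DC) and el.text and el.text.strip():
                fields.setdefault(el.tag[len(DC):], []).append(el.text.strip())
        records.append({"identifier": header.findtext(OAI + "identifier"),
                        "fields": fields})
    token = root.findtext(".//" + OAI + "resumptionToken")
    return records, (token.strip() if token and token.strip() else None)


def harvest(base_url, **kwargs):
    """Yield every record, following resumption tokens until none is left."""
    token = None
    while True:
        url = list_records_url(base_url, token=token, **kwargs)
        with urllib.request.urlopen(url) as resp:
            records, token = parse_page(resp.read())
        yield from records
        if token is None:
            break
```

Each yielded record is a plain dictionary, so it can be written to CSV or converted to whatever the target system expects.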

As you can see, there are many ways to export data from DSpace, so
choose whichever is most convenient for you.

[1] https://wiki.duraspace.org/display/DSDOC3x/Importing+and+Exporting+Items+via+Simple+Archive+Format#ImportingandExportingItemsviaSimpleArchiveFormat-ExportingItems
[2] https://wiki.duraspace.org/display/DSDOC3x/Batch+Metadata+Editing?focusedCommentId=34641195#BatchMetadataEditing-ExportFunction
[3] https://wiki.duraspace.org/display/DSDOC3x/OAI#OAI-OAI-PMHServer
[4] https://wiki.duraspace.org/display/DSDOC3x/OAI+2.0+Server
[5] https://wiki.duraspace.org/display/DSDOC3x/OAI+2.0+Server#OAI2.0Server-Add/RemoveMetadataFormats


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette


Re: DSpace collection export maximum size

Tim Donohue
Administrator
In reply to this post by Sai Deng
Hi Sophie,

You should be able to disable that maximum size check by either
commenting out that "org.dspace.app.itemexport.max.size" setting in your
dspace.cfg or by setting it to zero (which will also disable the check):

org.dspace.app.itemexport.max.size = 0

After that, it should just be a matter of ensuring you have the
necessary space on your server to store the export files.

Your process itself sounds reasonable (assuming you really want to get
everything into an Excel spreadsheet in the end).  The other bulk export
tool that DSpace (versions 1.7.x and above) offers is the "AIP Backup &
Restore tools". This tool can perform an export of all DSpace content
(Communities, Collections and Items) to a set of METS-based Zip files
(called AIPs).

https://wiki.duraspace.org/display/DSDOC18/AIP+Backup+and+Restore

- Tim

On 2/5/2013 8:07 AM, Sai Deng wrote:

> Dear list,
>
> We're trying to export DSpace collections from the front end (Admin
> Tools- Export Collection, or Export (migrate) Collection), it reports
> errors for certain collections:
>
> *Error with export*
>
> The overall size of this export is too large. Please contact your
> administrator for more information.
>
> The hosting company said that it's because DSPACE was running out of space and did adjustments in the server side (I assume), it still reports the same error. We then searched DSpace-Tech and changed the itemexport maximum size config to "org.dspace.app.itemexport.max.size = 1024", it still reports the same error for certain collections.
>
>
>
> What else can be done to resolve this issue?
>
>
>
> Our goal actually is to export collections in DSpace to CONTENTdm and eventually to a state-wide sharing system. Currently, we get metadata from the front end (Admin Tools- Export metadata) in csv file for a whole collection, and we acquire the actual bitstreams from the collection export. It involves some manual work to add the bitstream names to the csv file (in spreadsheet). There are probably better ways. If you have any insights, please reply!
>
> Thanks very much!
>
>
>
> Sophie
>
> Metadata Librarian
>
> UCF Library

Re: DSpace collection export maximum size

Sai Deng
Tim, "helix84" and Wally,
Thanks very much for the responses!
I have forwarded all the replies to the hosting company and will work with the people there to resolve the problem.

Regarding the ways to get metadata and bitstreams at the collection level from DSpace, I have a quick question:
If we need to do a fair amount of editing on the exported metadata (e.g. adding new fields, splitting the information in one field into several fields, changing field names and values, etc.), how can this step be integrated into the workflow when using either AIP backup (in the METS-based AIP format) or metadata harvested in an arbitrary format?
I still need to read the resources you sent. Thanks a lot!
Sophie




Re: DSpace collection export maximum size

helix84
On Tue, Feb 5, 2013 at 4:19 PM, Sai Deng <[hidden email]> wrote:
> Regarding the ways to get metadata and bitstreams in the collection level from DSpace, I have a quick question:
> If we need to do quite some editing for the exported metadata (e.g. adding new fields, splitting information in one field to several fields, changing field names and values etc.), how can this step be integrated to the workflow by either using AIP backup (in METS-based AIP format) or harvesting metadata in an arbitrary format?

In that case, CSV is probably indeed the most convenient option.
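
If the edits are repetitive, they could also be scripted against the exported CSV instead of being done by hand in a spreadsheet. The Python sketch below is only an illustration of renaming columns and splitting one field into several. The function names, the target column names and the ";" separator are assumptions, not anything DSpace or CONTENTdm prescribes.

```python
import csv


def transform_row(row, renames=None, split_field=None, split_into=(), sep=";"):
    """Return a new row dict with renamed columns and one field split in parts."""
    renames = renames or {}
    out = {}
    for name, value in row.items():
        if name == split_field and split_into:
            # Split into at most len(split_into) parts; pad missing parts with "".
            parts = [p.strip() for p in (value or "").split(sep, len(split_into) - 1)]
            parts += [""] * (len(split_into) - len(parts))
            out.update(zip(split_into, parts))
        else:
            out[renames.get(name, name)] = value
    return out


def transform_csv(csv_in, csv_out, **options):
    """Apply transform_row to every row of csv_in and write the result."""
    with open(csv_in, newline="", encoding="utf-8") as src:
        rows = [transform_row(r, **options) for r in csv.DictReader(src)]
    if not rows:
        return  # nothing to write
    with open(csv_out, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
```

For example, `transform_csv("export.csv", "edited.csv", renames={"dc.title": "Title"}, split_field="dc.coverage.spatial", split_into=("City", "State"))` would rename one column and split another, with both file names and field names here being placeholders.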


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
