Incremental Refresh / Appending / Stacking Data in Paxata

Image Sensor
Morning,

Quick question: does Paxata have the capability to perform an incremental refresh? I receive a set of Excel files every month and would like to stack them on a monthly basis; this would be considered an incremental refresh. Unfortunately, I haven't figured out a way to complete this task in Paxata.

It doesn't seem that Paxata supports this. I tried creating a standard Excel file and then adding a version, but that didn't work. I then tried creating a foundation dataset, creating another dataset set to automate (new data), and appending it onto the foundation dataset within a project, hoping the project would keep all the appended data. However, no dice there either.

Any thoughts?
13 Replies
NiCd Battery
Hello,
I may need a bit of clarification in order to answer your question - are you importing the Excel files locally or from a shared area like Sharepoint?  Paxata does not support the automation of local file imports into our Library (don't want the server reaching out to individual desktops) but we do support automated import from enterprise sources (SFTP, Sharepoint, S3, WASB, databases, etc.).  If you load the latest version of the Excel file into the Library, either on demand if a local file or on a schedule if enterprise system then you can create a project which starts with the original file and Appends the updated file.  When there is a new updated file in the Library you will notice a "refresh datasets" button at the bottom of the Steps panel in the project turns green.  You can select this option and then choose to update the latest version of any new datasets in the project.  You can also set automation options to use "latest version".  Does this help?  Please let me know if I can provide additional clarification.

Thanks!
Martha
Image Sensor
Morning Martha,

I am importing the Excel files locally; I receive them via email. Is there a way to do it with a local Excel file?
NiCd Battery
No, it is not possible to automate the import of local files from your desktop into the Paxata Library. If you click on a dataset in the Library, you do have the option to add a version; this helps keep the Library simple, so you can view each version of a dataset rather than adding new versions as separate/distinct datasets in the Library.

Image: https://us.v-cdn.net/6030933/uploads/editor/w3/bvy8fay66ybt.png
Image Sensor

Interesting. Would it work from a share drive? I could load it to a network share.



DC Motor
Hello Ychamb,

Paxata does support the Network Share (SMB) connector, so automation can be done if the files are on a network share. I hope this helps! 

Thanks,
Akshay
Image Sensor
Is it possible to have an incremental refresh from an enterprise data source such as a Hive table or a SQL Server database?
NiCd Battery
Import can be based on either the entire table or a query. If you use a query, you could incorporate a condition in the WHERE clause, based on the current day or some other flag, that captures only the updated rows. You can then automate this to run. I hope this helps!
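To make the WHERE-clause idea concrete, here is a small sketch that builds such an incremental query. The table and column names (`sales_fact`, `last_modified`) are made up for illustration; a production job should use parameterized queries via its database driver rather than string formatting:

```python
from datetime import date, timedelta

def incremental_query(table: str, updated_col: str, days: int = 1) -> str:
    """Build a query that captures only rows modified since a cutoff date."""
    cutoff = date.today() - timedelta(days=days)
    return (
        f"SELECT * FROM {table} "
        f"WHERE {updated_col} >= '{cutoff.isoformat()}'"
    )

# Example: pull rows from a hypothetical table changed in the last 7 days.
print(incremental_query("sales_fact", "last_modified", days=7))
```

Scheduling this query to run daily (or monthly) gives you the incremental slice to append onto the foundation dataset.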

Martha
NiCd Battery
Hey Martha, we came up with a question related to the start of this thread. Is there any documentation that explains how to incrementally append data to a dataset? This becomes circular at some point, and we're having trouble figuring it out. It seems like you would need to modify an existing dataset with a project without having the project create a new dataset?
Linear Actuator
Hi @bella21,
Here's one way to accomplish this in Paxata:
  1. Import version 1 of the dataset
  2. Create a project with the dataset imported in step 1
  3. Perform necessary actions like shaping, computed columns, etc. 
  4. Add a lens, and publish the output of the lens to Library 
  5. Add a new version of the dataset imported in Step 1
  6. Edit the steps to add an Append step just before the Lens you published
  7. In the Append step, select the output of the lens you published in step 4, bringing it back into the project
  8. Use the "refresh datasets" button to refresh the dataset you started with to the latest version (the version imported in step 5 will replace the version imported in step 1)
  9. If the incremental data tends to include duplicates, add a Deduplicate step to address them
  10. Automate the project
  11. Select "Use Latest Version" for both input datasets
Please let me know if you need additional help with this based on the specific use-case, Aaron.  
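The append-and-republish loop in the steps above can be sketched outside Paxata as a small pandas routine. Column names (`id`, `value`) and the incoming batches are hypothetical:

```python
import pandas as pd

def refresh(published: pd.DataFrame, latest: pd.DataFrame, key: str) -> pd.DataFrame:
    """Append the latest batch onto the previously published output,
    then deduplicate on a key column (roughly steps 6-9 above)."""
    combined = pd.concat([published, latest], ignore_index=True)
    return combined.drop_duplicates(subset=key, keep="last").reset_index(drop=True)

# First run: the published output is just version 1 of the dataset.
v1 = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
published = v1

# Next month: version 2 arrives with one new row and one duplicate.
v2 = pd.DataFrame({"id": [2, 3], "value": ["b", "c"]})
published = refresh(published, v2, key="id")
print(published)  # three unique rows: ids 1, 2, 3
```

Each automated run feeds the previous published output back in as one of the two inputs, which is what makes the process feel circular.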

Regards,
Shyam Ayyar
NiCd Battery
Hi @bella21 ,

To add to @sayyar's response, please let me know if it would help to have a quick Zoom session to build a working prototype. I will be happy to set it up.

With Best Regards
Sudheer Kumar


Image Sensor

Thanks @sayyar,


That seems like a pretty challenging multi-tiered process for a simple stacking operation, but thanks for figuring it out. Instead of taking that multi-step approach, I created a historical master view that is refreshed as the foundation and then stacked with new data pulled from SharePoint.

On a side note, other programs have the ability to recognize changes in data by a specific data element. Maybe this would be a good capability to add to Paxata?

Linear Actuator
@ychamb, 
I am glad that you are able to perform this in another manner as well. Would you please share the steps you took to accomplish it so that other users will also benefit? 

Thank you for the feedback on the ability to recognize changes in data. We will look into adding the capability. 

Regards,
Shyam Ayyar

Image Sensor

Hey morning @sayyar,

As mentioned above.

I created a historical master view that is refreshed as the foundation and then stacked with new data pulled from SharePoint.

Basically, I automated the historical/foundational piece, then automated the SharePoint data, and created a project that stacked the two together.

The manual part: when my SharePoint data reaches its record limit, I have to export half the records, paste them into the historical master, and delete them manually from SharePoint (to eliminate dupes).
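That manual "move half the records into the master" step could be scripted along these lines. The datasets and the ten-row "limit" here are hypothetical stand-ins, not the actual SharePoint list:

```python
import pandas as pd

# Hypothetical stand-ins: the SharePoint-sourced dataset near its row limit
# and the historical master it feeds.
sharepoint = pd.DataFrame({"id": range(1, 11), "value": list("abcdefghij")})
master = sharepoint.iloc[0:0]  # empty master with the same schema

# Move the oldest half of the SharePoint rows into the master,
# then drop them from the SharePoint side so no duplicates remain.
half = len(sharepoint) // 2
master = pd.concat([master, sharepoint.iloc[:half]], ignore_index=True)
master = master.drop_duplicates(subset="id")
sharepoint = sharepoint.iloc[half:].reset_index(drop=True)

print(len(master), len(sharepoint))  # 5 rows in each after the split
```

Deduplicating on the key column after the move guards against a record landing in both places.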

