4.1.1 Data Manipulation
Contents
Worksheet
Selecting and Coloring Worksheet Cells
Origin has two x-functions for conditional selection - wcellsel and wxt; The former(wcellsel) is cell-based on an arbitrary range of cells in a worksheet. The latter(wxt) is equivalent to the worksheet extract dialog that works at the column level.
The following code uses wcellsel X-function to select cells that meet a condition, and then uses wcellcolor to color them.
// create new book with some columns and fill with random data newbook; newsheet cols:=6 xy:="Y"; loop(i,1,6){wcol(i)=normal(100)}; // select all cells with value greater than or equal to 0.5 and color them wcellsel 1:end c:=ge v:=0.5; wcellcolor c:=color(green); wcellcolor c:=color(red) type:=1; wcellsel; // this deselects cells so colors show
Coloring Worksheet Rows by wxt
While the wcellsel function can select any worksheet cells, it can only select those cells with a simple condition for the entire range of cells. If you only need to select rows, and you need more precise conditions, then you can use the wxt X-Function, which is basically the LabTalk version of the Worksheet Query dialog. Fill two column with random numbers before running the following example:
//select rows (sel:=1) that meet condition //col(2)>0.3 for the active worksheet wxt "col(2)>0.3" sel:=1; //change the color of the rows of the selected cells wcellcolor c:=color(green); //reset the worksheet selection so you can see the colors correctly wcellsel;
Using wxt with Text Columns
There are special considerations when the condition involves text strings. The following example shows a more complicated condition to select based on both a text column and a numeric column.
//import a sample file that has make of cars and their various characteristics fname$=system.path.program$ + "Samples\Statistics\automobile.dat"; newbook; impasc options.sparklines:=0; // Select all rows in worksheet with make=Acura and Gas Mileage > 11 // Use range variable to make query more readable than "col(Make)[i]$" range car=col(Make); // Using string variable to avoid having to write "Acura" inside " " string str="Acura"; //wxt will translate "=" to "==", "and" to "&&" wxt "car=str$ and col(Gas Mileage)>11" sel:=1; // Select the rows wcellcolor c:=color(green); // Color those rows wcellsel; // Clear the selection
Copy and Paste Data to New Sheet and Apply Transpose
This code uses copydata X-function to copy a block of numeric data to a new worksheet and then use wtranspose X-function to transpose the data.
//Create a new worksheet with six columns of random number newbook; newsheet cols:=6 xy:="Y"; loop(i,1,6){wcol(i)=normal(100)}; //Highlight a block in the worksheet worksheet -s 3 3 4 6; //Copy the highlighted block into a new worksheet copydata orng:=<new>!<new> clear:=1; //Transpose the resulting worksheet wtranspose;
copydata X-Function doesn't work for text data. So if the data block has both numeric and string data, please use range -v notation to define blocks as range variables and assign original block to new block to paste the value. Then use wtranspose to transpose the data.
//Import a sample data into a new book fname$=system.path.program$ + "\Samples\Statistics\Protein Consumption in Europe.dat"; newbook; impasc; //Define a block of string data; int nc = 4; // number of columns in the block int nr = 10; // number of rows in the block int fi = 4 + nr; // final row index of the block range -v r1 = 1[4]:$(nc)[$(fi)]; // Set input range //Create a new worksheet ; newsheet; wks.ncols = nc; int fo = 1 + nr; // final row index of result range range -v r2=1[1]:$(nc)[$(fo)]; // Set result range //assign values in r1 to r2 r2=r1; //Transpose the block wtranspose; // Set the first row as Long name wrow2label longname:=1
Splitting and Extracting Data to New Sheets
This code demonstrates the use of the discfreqs and wxt X-Functions as well as simpler data manipulation to extract data from a worksheet and place each discrete set into its own sheet. The discfreqs X-Function is an OriginPro-only feature.
//create a new book and import data file newbook sheet:=1; string fname$ = system.path.program$ + "Samples\Data Manipulation\US Metropolitan Area Population.dat"; impASC; //store the values of the desired column in a range range cc = [%(page.name$)]%(wks.name$)!col(4); string originalsheet=%(wks.name$); // create two new columns for our split variables wks.nCols = wks.nCols + 2; wks.col5.lname$=City ; wks.col6.lname$=State; // Read over each entry in the range... for (ii=1; ii<=cc.getsize(); ii++) { initialItem$=cc[ii]$; //create a string of the cell... // write the variables to the iith cell of our two new columns... Col(City)[ii]$=initialitem.gettoken(1,',')$; Col(State)[ii]$=initialitem.gettoken(2,',')$; } // Analyze the new "State" row to find the discrete "State" designations discfreqs irng:=col(State) rd:=[<input>]Frequency!; // Store the designations in a range range ff = [%(page.name$)]Frequency!col(1); // Loop over each unique State value from the discrete frequency output for (jj=1; jj<=ff.getsize(); jj++) { // This string allows the conditional to point to a different // row of the frequency table with each iteration string state$ = [%(page.name$)]2!col(1)[jj]$; // The wxt X-Function requires that the user inclue the "[i]" term // when using a string as part of the conditional test string statequote$ = ""%(state$)""; string cond$ = "col(6)[i]$==" + statequote$; // Wxt extracts each row of the original sheet corresponding // to the currently tested state to a new sheet wxt test:=cond$ iw:=1 ow:="%(state$)"; }
Subgroup a Worksheet to New Worksheets
This code demonstrates the use of the discfreqs and wxt X-Functions as well as simpler data manipulation to extract data from a worksheet and place each discrete set into its own sheet. Tree and StringArray variables are used to hold the results of discrete frequency count. The discfreqs X-Function is an OriginPro-only feature.
//create a new book and import data file newbook sheet:=1; string fname$ = system.path.program$ + "Samples\Statistics\automobile.dat"; impASC; //store the values of the desired column in a range range makeCol = [%(page.name$)]%(wks.name$)!col(make); range sourceWks = [%(page.name$)]%(wks.name$)!; //create a tree to hold the result of frequency count and do frequency //count to "Make" column to find the discrete "Make" designations //this X-Function is only available in OriginPro tree tr; discfreqs irng:=makeCol rd:=tr; //create a string array and put the "Make" designations into it StringArray sa; sa.append(tr.FreqCount1.Data1); //if sa contains data //loop over each unique "Make" designation //and extract data from original worksheet to new worksheets if (sa.GetSize() != NANUM) { for (ii=1; ii<=sa.GetSize(); ii++) { //this string is used to generate condition to //point to different "Make" designation stored //in string array for each iteration string newWk$ = sa.GetAt(ii)$; //construct tested condition for wxt string cond$ = "makeCol$==" + newWk$; //wxt extracts rows of the original sheet according to //the currently tested state and put them to a new //worksheet named by the corresponding designation wxt test:=cond$ iw:=sourceWks ow:=[<input>]<new %(newWk$)>; } }
Masking the Outliers
This script demonstrates the use of rowstats and wcellmask X-Functions to perform descriptive statistics on rows and mask the cell whose value is defined as outlier.
// create a new book and import data file newbook sheet:=1; string fname$ = system.path.program$ + "Samples\Graphing\Contour.dat"; impASC; // get number of columns and rows int nCols = wks.nCols; int nRows = wks.nRows; // add two columns for mean and sd col(nCols+1)[L]$=Mean; col(nCols+2)[L]$=SD; // perform row stats rowstats -r 0 irng:=col(1)[1]:col($(nCols))[nRows] mean:=col(Mean) sd:=col(SD); // mask outlier for(int ii=1; ii<=nRows; ii++) { double dLower = col(Mean)[ii]-col(SD)[ii]; // mean-sd double dUpper = col(Mean)[ii]+col(SD)[ii]; // mean+sd for(int jj=1; jj<=nCols; jj++) { double dOutlier = col($(jj))[ii]; // get cell value if(dOutlier<dLower || dOutlier>dUpper) { wcellmask irng:=col($(jj))[ii]; // mask the cell } } }
Matrix
RGB Image Matrix
While a matrix contains information on how many rows and columns to display, it is internally stored as a linear array. This fact can be used to copy data from a matrix to a worksheet column as in this instance where we
- Import an image file
- Extract the Red, Green and Blue (RGB) channels (matrix to matrix)
- Copy the channel data (matrix to column)
- Construct a string representation of the RGB values
- Calculate a histogram of RGB values
// Point to a sample image and import fname$ = SYSTEM.PATH.PROGRAM$ + "Samples\Image Processing and Analysis\Flower.jpg"; impImage; // Split the RGB into separate Red, Green and Blue matrices imgRGBsplit r:=[RGBOUT]RED!1 g:=[RGBOUT]GREEN!1 b:=[RGBOUT]BLUE!1; // Create a new Workbook newbook; wks.ncols = 6; wks.col1.lname$ = Red; wks.col2.lname$ = Green; wks.col3.lname$ = Blue; wks.col4.lname$ = RGB; wks.col5.lname$ = Value; wks.col6.lname$ = Count; // Worksheet ranges range raDR = 1, raDG = 2, raDB = 3, raDRGB = 4; // Matrix ranges range raR = [RGBOUT]RED!1; range raG = [RGBOUT]GREEN!1; range raB = [RGBOUT]BLUE!1; // Copy data from Matrix to Workbook by range assignment raDR = raR; raDG = raG; raDB = raB; // Create a string showing the RGB triplet col(D) = col(1)$ + " " + col(2)$ + " " + col(3)$; // Calculate the histogram discfreqs irng:=raDRGB rd:=col(Value);
Convert Multiple Matrices into Worksheets
Suppose we have scanned X, Y, and Intensity data stored in a matrixbook for every height Z, and we have 20 such matrixbooks (scaen at 20 heights in Z direction) with Long Name as "Z2um", "Z4um", "Z6um"..., in which the number between "Z" and "um" means the Z height. Now we want to combine all scanned X, Y, Intensity matrices into one worksheet of XYZI structure.
This is common routine for 3D confocal microscopy images and time-lapse images.
doc -e M { string strMat$ = page.longname$; strMat$ = strMat.Between("Z","um")$; // Get Z Values from Matrixbook Long Name int ZHeight = %(strMat$); m2w method:=xyz; // Convert Matrix to XYZ wks.addcol(); // Add Z column csetvalue col:=col(4) formula:="ZHeight"; // Set Value from Numeric Extracted from Matrix Long Name } //Append converted worksheets into one string strPWKB$ = page.name$; // Let Current Workbook as Parent Workbook string strPWKS$ = wks.name$; // Remember Current Worksheet for Appending Other Sheets doc -e W { string strTempWKB$ = page.name$; // Book Short Name of Current Workbook if (strTempWKB$ != strPWKB$) // If not Parent Workbook, Start Appending { wAppend irng:=([%(strPWKB$)]1!, [%(strTempWKB$)]1!) method:=row ow:=[%(strPWKB$)]1!; // Appending win -c %(strTempWKB$); // Delete Appended Worksheets } else continue; } dataset sps = {4,1}; dataset sodr = {1,1}; wsort nestcols:=sps order:=sodr; // Sort Appened Worksheets by Z (Height) colmove rng:=col(4) operation:=left; // Move Z column to be 3rd Column wks.col1.lname$="X"; wks.col2.lname$="Y"; wks.col3.lname$="Z"; wks.col4.lname$="Intensity"; page.longname$= "XYZI Table"; page.title=1; // Naming
Normalize a Matrix
Internally, Origin'x matrices are one dimensional vector-like arrays. Therefore mathematical operations performed are them are similar to those of vectors rather than true matrices.
// Create a matrix newbook mat:=1; mdim cols:=10 rows:=10; msetvalue formula:="rnd()"; // Normalize it to [0, 1] vnormalize ix:=mat(1) method:=range ox:=<input>;