Counting entries in Excel table

I’ve been on a conference the last two days, but try to have a look at this soon. Didn’t you already have a class that could read the Excel file, and find the healthy and sic ones by checking the cell color?

Hi Marco. No, i have not a class that find the healthy and the sick by the colour of the cell. I have to implement a class/method that read an excel file (.xls) and split it in two parts: for example 50% the healthy and 50% the sicks or, if is possible, to choose the percentage before each time execution, but not recognized by the colour (and to add for each of this two tables the first column with the probe set id (AM_1xxxx)). Bye.

OK, I thought that the class that you had uploaded to mediafire in the first post did check the cell color. The file is not available anymore, but I have it on anonther PC. I’ll have a closer look at this beginning of next week.

Hi Marco. The class that you are looking for is this:

import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashSet;
import java.util.LinkedList;


//import util.DivideMatrix;
//import util.MatrixCleaner;
//import filters.ThresholdFilter;
//import math.Mathematics;
//import util.DivideMatrix;
//import util.MatrixCleaner;


import jxl.Sheet;

import jxl.Workbook;

/**
 * 
  This class read file excel
 * and is possible distinguish between sicks and healthy
 * through the patient's name in red.
 */

public class ReaderXLS {

	
				private InputStream inp;
	                        private String [] [] mat;
	                        private HashSet<Integer> sicks;
	                        private HashSet<String> simbols;
	                        private Workbook wb;
	                        private Sheet sheet;
	                        private int columns,rows;
	                        private LinkedList<String [][]> listMatrix;
	
	
				public ReaderXLS(String path) throws FileNotFoundException{
		
					inp = new FileInputStream(path);
		                      
		                        sicks = new HashSet<Integer>();
		
					simbols = new  HashSet<String>();
		
		
					try {
			                        //workbook creation from file excel
		    
						this.wb = Workbook.getWorkbook(inp);
		    
						//get sheet 0 in this case there is only a sheet
		    
						this.sheet = wb.getSheet(0);
		    
		    
						columns = sheet.getColumns();
		    
						rows = sheet.getRows();
		    
		                                this.mat = new String [rows][columns];
		    
		   
						//cycle for matrix creation
		    
						for(int row = 0;row < rows;row++) {
		    	 
						for(int col = 0;col < columns;col++)   {
							// getCell(column, row);
							mat[row][col] =  sheet.getCell(col,row).getContents().toLowerCase();
		    		 
							simbols.add(sheet.getCell(col, row).getContents().toLowerCase());
		    		 
		    		 
		    		 
							//distinction between sick and healthy through the font color!!
//		    		
							if( sheet.getCell(col, row).getCellFormat().getFont().getColour().getValue() == 10 )
								{
//		    			
									sicks.add(col);
//		    		
								}
		        
							}
						}
//		    
						System.out.println("mat "+mat.length+"x"+mat[0].length);
					} catch (Exception ioe) {

					System.out.println("Error: " + ioe);
		
					}
	
				}
	
	


/**
	 
* Build a list of matrix one for each sheet
	 
* @param path
	 
* @param sheetsNumber
	 
* @throws FileNotFoundException
	 */
	

	public ReaderXLS(String path, short sheetsNumber) throws FileNotFoundException{
		
		inp = new FileInputStream(path);
		
		listMatrix = new LinkedList<String[][]>();
		
		try {
			
			//workbook creation from file excel
		    
			this.wb = Workbook.getWorkbook(inp);
		   
		    
			//cycle for read n sheets
		   
			for (int i = 0; i < sheetsNumber; i++) {
		    
				this.sheet = wb.getSheet(i);
		    
		    
				columns = sheet.getColumns();
		    
				rows = sheet.getRows();
		    
		    
				this.mat = new String [rows][columns];
		    
		   
				//cycle for matrix creation
		    
				for(int row = 0;row < rows;row++) {
		    	 
					for(int col = 0;col < columns;col++)   {
		    		 
						mat[row][col] = sheet.getCell(col, row).getContents();
		        
					}
		    
				}

	   		        listMatrix.add(mat);
			
			}   
		
		} catch(Exception ioe) {
		    
		System.out.println("Error: " + ioe);
		
		}
	
	}
	
	


	public LinkedList<String[][]> getListMatrix() {
		
		return listMatrix;
	}

	

	public String[][] getMat() {
		
		return mat;
	}
	
	
	public HashSet<Integer> getSicks() {
		
		return sicks;
	}
	

	public HashSet<String> getSimbols() {
		
		return simbols;
	}

	

	public void writeMatrixOnFile(String path) throws IOException{
		
		BufferedWriter bw = new BufferedWriter(new FileWriter(path));
		
		
			for (int i = 0; i < mat.length; i++) {
			
				for (int j = 0; j < mat[0].length; j++){
				
				bw.append(mat**[j]+"	"+"	");bw.flush();
			
			}
			
			bw.append("
");bw.flush();
		
		}
	
	}
	

	public void writeMatrixOnFile(String path, String [][] mat) throws IOException{
		
		BufferedWriter bw = new BufferedWriter(new FileWriter(path));
		
		
		for (int i = 0; i < mat.length; i++) {
			
			for (int j = 0; j < mat[0].length; j++){
				
				bw.append(mat**[j]+"	"+"	");bw.flush();
			
			}
			
			bw.append("
");bw.flush();
		
		}
	
	}
	


	
//public static void main(String[] args) throws FileNotFoundException {
//	
	
//	ReaderXLS r = new ReaderXLS("XLS_file.xls");

	//	String [][] mat = r.getMat();
//	System.out.println("Ciao");
	
//	System.out.println(mat.length+","+mat[0].length);
	
//	r.printMat(mat);
//	int v [] = {1,2,3};
	
//	DivideMatrix divide = new DivideMatrix(r.getMat(), v );
	
////	
//	String  healthyM[][] = null, sicksM[][] = null;
	
//	try {
//		MatrixCleaner d = new  MatrixCleaner(divide.getHealthyM());
	
//		healthyM = d.getNormalizedMatrix();
	
//		
//		d = new MatrixCleaner(divide.getSickM());

	//		sicksM = d.getNormalizedMatrix();
	
//		
//	} catch (InterruptedException e) {
	
//		e.printStackTrace();
	
//	}
//////	
//////	r.printMat(divide.getSickM());
	
//////	System.out.println("sicksM[0].length= "+sicksM[0].length+" , healthyM[0].length= "+healthyM[0].length);

	////	
////	try {
////		r.writeMatrixOnFile("matrixSicsSNPS.txt",sicksM);

	////		r.writeMatrixOnFile("matriceHealthySNPs.txt",healthyM);

	////
////	} catch (IOException e) {
////		e.printStackTrace();

	////	}
////	
////	try {
////		
////		String xxx [][] = Mathematics.op1(sicksM, healthyM);
	
////		r.writeMatrixOnFile("matXXX.txt", xxx);
	
////		String xxxF [][] ;
	
////		ThresholdFilter tf = new ThresholdFilter(0.00003d, xxx);
	
////		xxxF = tf.getFilteredMatrix();
	
////		r.writeMatrixOnFile("matXXXFFF.txt", xxxF);
	
////	} catch (Exception e) {
////		e.printStackTrace();
	
////	}
//	
//}


	
	//public void printMat(String [][]m){
	
//	for (int i=0; i<m.length; i++) {
        // 		scan rows
	
//	        for (int j=0; j<m[0].length; j++)  
	// 			scan elements row i	            
				
//System.out.print(m**[j]+" "); 
				// print element row	        

				// System.out.println();              
				// end row
	
//}
	
//}
	

}

But this, distinguish between sick and healthy through the patient’s name in red: i want instead, as i wrote in my last post, with the percentage before each time execution (manually in the code or asked if is possible). I wish you a nice day. Thanks.

Ah, I misunderstood that.

So you want to select the sick/healthy ones arbitrarily? Just randomly use e.g. 50% of the columns and declare them as “healthy”? That sounds strange. But this can easily be done with the code that I already posted: It contains this
List sickColumns = …
where you can simply use a list that contains 50% of all table columns…

Yes Marco but i don’t know how. Yes, it sound strange, but they told me to do that. This have to be implemented with any excel file and not only for this (not with the csv). So if, for example, i have an excel file with 70 columns: 35 + 35, the first group are the healthy (or sick) and the second ones the others. But if i have a file with 120, 10, 5, 200, 37, 78, etc. colums the “program” should work the same (and i’d like to add the first column with the probe set id in this two first tables like the others ones).
If it is possible, the best thing to do (maybe) is to choose the percentage (before the execution manually or type in the video before the execution). If it is possible, please help me.

Again: Who decides which are the “healthy” ones and which are the “sick” ones?

Can you confirm that the healthy and sick ones should be chosen randomly, based on a given percentage (like 50:50 or 70:30) ?

Sorry Marco. There’s no problem on which are the sick and the healthy ones: maybe the first are healthy and the second are the sick or vice versa. Yes is based on a given percentage 50:50, 70:30, 20:80, 10:90, … that i have to choose at least manually (if it is possible). Thanks.

Hi Marco. I have implemented this from the original ReaderXLS but just print the matrix but i’m not neither able to print the linkedlist:

import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashSet;
import java.util.LinkedList;

import jxl.Sheet;
import jxl.Workbook;
import jxl.Cell;
import jxl.DateCell;
import jxl.LabelCell;
import jxl.NumberCell;
import jxl.read.biff.BiffException;

public class ReadExcel {

    private InputStream inp;
    private String[][] mat;
    private HashSet<Integer> sicks;
    private HashSet<String> simbols;
    private Workbook wb;
    private Sheet sheet;
    private int columns, rows;
    private LinkedList<String[][]> listMatrix;

    public ReadExcel(String path) throws FileNotFoundException {

        inp = new FileInputStream(path);
        sicks = new HashSet<Integer>();
        simbols = new HashSet<String>();
        listMatrix = new LinkedList<String[][]>();

        try {
            //workbook creation from file excel
            this.wb = Workbook.getWorkbook(inp);

            //get sheet 0 in this case there is only a sheet
            this.sheet = wb.getSheet(0);
            columns = sheet.getColumns();
            rows = sheet.getRows();
            this.mat = new String[rows][columns];

            //cycle for matrix creation
            for (int row = 0; row < rows; row++) {
                for (int col = 0; col < columns; col++) {
                    //getCell(column, row);
                    mat[row][col] = sheet.getCell(col, row).getContents();//.toLowerCase();
                    simbols.add(sheet.getCell(col, row).getContents());//.toLowerCase());

                    //distinction between sick and healthy through the font color!!
                    if (sheet.getCell(col, row).getCellFormat().getFont().getColour().getValue() == 10) {
                        sicks.add(col);
                    }
                }
                listMatrix.add(mat);
            }

            //System.out.println("Mat " + mat.length + "x" + mat[0].length + ":");
        } catch (Exception ioe) {
            System.out.println("Error: " + ioe);
        }
    }

    public LinkedList<String[][]> getListMatrix() {
        return listMatrix;
    }

    public String[][] getMat() {
        return mat;
    }

    public HashSet<Integer> getSicks() {
        return sicks;
    }

    public HashSet<String> getSimbols() {
        return simbols;
    }

    public void printMat(String[][] m) {
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[0].length; j++) {
                System.out.print(m**[j] + " ");
            }
            System.out.println();
        }
    }

    public void writeMatrixOnFile(String path) throws IOException {
        BufferedWriter bw = new BufferedWriter(new FileWriter(path));

        for (int i = 0; i < mat.length; i++) {
            for (int j = 0; j < mat[0].length; j++) {
                bw.append(mat**[j] + "	" + "	");
                bw.flush();
            }
            bw.append("
");
            bw.flush();
        }
    }

    public void writeMatrixOnFile(String path, String[][] mat) throws IOException {

        BufferedWriter bw = new BufferedWriter(new FileWriter(path));

        for (int i = 0; i < mat.length; i++) {
            for (int j = 0; j < mat[0].length; j++) {
                bw.append(mat**[j] + "	" + "	");
                bw.flush();
            }
            bw.append("
");
            bw.flush();
        }
    }

    public static void main(String[] args) throws FileNotFoundException {

        ReadExcel r = new ReadExcel("XLS_file.xls");
        String[][] mat = r.getMat();
        //System.out.println(mat.length + "," + mat[0].length);
        r.printMat(mat); //How can i print the linkedlist and then split it?
    }

    public void printMat(LinkedList<String[][]> lm) {

        for (int i = 0; i < lm.size(); i++) {
            System.out.println(lm.get(i));
        }
    }
}

I hope in your help.

I’ll see how this may be combined with the rest of the program in the next few days.

Again: All this was NOT intended to become an “Application”. If you want an Application, YOU have to write it. Apart from that: All this would be much less hassle if you had taken the time and effort to describe the problem and the intention in detail. Messing around, guessing what you might need, is no fun.

However, attached you find an updated version. It can read the data directly from the XLS, and will determine the “sick” ones based on the red font color of the table header. Where does the list of columns come from that you mentioned in this post: http://forum.byte-welt.de/showthread.php?p=17172#post17172 ? In the attached version, you can switch between the predefined list, and the list of ALL values that occur in the original table by changing the method that is called in “createRelevantColumnsList”.

bye

I’m sorry Marco, but i did not even know, so i could not tell you. I don’t know where the list of columns come from (they told me that “are values” nothing else).
Now i have 3 problems:

  1. The last row of sick and healthy table has all values “null” and the last row of the sick and healthy count table are all with zero why ?

  2. I changed the XLS_file from 29 to 22 colums and from 1932 to 77 rows and now i have a nullpointer exception. This is the Netbeans’ output:


java.lang.NullPointerException
	at jcudafishertestmain.ExcelTableData.createFromExcelFile(ExcelTableData.java:29)
	at jcudafishertestmain.JCudaFisherTestMain.createFromXLS(JCudaFisherTestMain.java:21)
	at jcudafishertestmain.JCudaFisherTestMain.main(JCudaFisherTestMain.java:15)
Exception in thread "main" java.lang.NullPointerException
	at jcudafishertestmain.JCudaFisherTestMain.createFromXLS(JCudaFisherTestMain.java:23)
	at jcudafishertestmain.JCudaFisherTestMain.main(JCudaFisherTestMain.java:15)
Java Result: 1

Why it does not work ?

  1. Last but not least important, how can i add the first column with the AM_1xxxx in each tables ? It is important Marco.
  1. ExcelTableData, change the line
    SimpleTableData result = new SimpleTableData(numRows, columnNames, null);
    to
    SimpleTableData result = new SimpleTableData(numRows-1, columnNames, null);

  2. Post the modified file somehow

  3. You mean for the “Count” tables?

  1. Ok resolved, thanks;

  2. The file is attached;

  3. I mean: Sick table, Healthy table, Sick count, Healthy count.

2.) A simple debug output like
System.out.println("Sheet has “+sheet.getColumns()+” columns, accessing “+c+” gives "+cell);
would have shown that the table still contains 28 columns, but the last ones are empty. You must delete the columns, and not only remove their contents. Then the program works without modification.

3.) Is it just about printing the first column? Just print it, there’s no problem with that. Do you need a table that really contains this column? What do you want to do with the table? However, add this method in SimpleTableData

    public static TableData<Object> combine(TableData<?> other, String columnName, List<?> column, int columnIndex)
    {
        SimpleTableData<Object> result = new SimpleTableData<Object>();
        result.headers = new ArrayList<String>(other.getHeaders());
        result.headers.add(columnIndex, columnName);
        result.data = new ArrayList<List<Object>>();
        for (int r=0; r<other.getNumRows(); r++)
        {
            List<?> otherRow = other.getRow(r);
            List<Object> row = new ArrayList<Object>(otherRow);
            row.add(columnIndex, column.get(r));
            result.data.add(row);
        }
        return result;
    }

and this Method in JCudaFisherTestMain

    private TableData<Object> addFirstColumn(TableData<?> tableData)
    {
        return SimpleTableData.combine(tableData, "ID", table.getColumn(0), 0);
    }

Then you can use this method to add the first column to an existing table

TableData<Object> sickTableWithOneMoreColumn = addFirstColumn(sickTable);

So how long should we continue with that? I’ll probably not invest much more time for that. I like to help you with CUDA and JCuda where I can, but don’t see a reason for writing such an application and doing YOUR work.

  1. Ok, it works, thanks;

  2. I need a table that really contains this column and then print it, if possible. I have done what you say but i don’t know where i have to use this method:

TableData<Object> sickTableWithOneMoreColumn = addFirstColumn(sickTable);

Maybe in the execute() in the JCudaFisherTestMain ?

If it is possible i ask you to describe how this application works in “detail”:

  1. Reads the excel file (distinguish sick and healthy through the colour of the table headers);
  2. Create the array list;
  3. Create the sick and the healthy table and the create the count table;
  4. Create (?) the ptx file and then use CUDA/Jcuda to do the work (?);
  5. … ?

You have right Marco. I believe that the more is being done, maybe some little changes to the layout, nothing else. After this work, i will not annoy you ever again.

You can use the line at any place inside the main class, in order to create a new Table from an existing one.

Before

System.out.println("Sick table");
System.out.println(TableDataPrinter.toString(sickTable));

After:

System.out.println("Sick table");
TableData<Object> sickTableWithOneMoreColumn = addFirstColumn(sickTable);
System.out.println(TableDataPrinter.toString(sickTableWithOneMoreColumn ));

In how much detail should I describe what the program is doing? If you had done it on your own, you could also describe it on your own. Yes, it reads the XLS and creates a “TableData” with the original table, then computes a “TableData” that contains the ‘counts’, then creates an array (in the “Tasks” class) containing the values (h0, h1, s0, s1) for which the TwoTailedP should be computed, creates a PTX from the CU and executes the kernel, wich receives the (h0, h1, s0, s1) array as its input.

Ok Marco, now all works fine. I know but, if it is possible, something like that maybe with more details otherwise it’s enough. Thanks.

Don’t get me wrong. Again, I’d like to help you. And if this could turn out to be an application that could be considered as a nice use-case demonstrating the applicability of CUDA/JCuda, I could put more effort in that. But I can’t do that as long as this goal (or the goal at all) is not clear. And in any case, messing around with trivialities like printing some data structures to the console is not what the focus should be on. (Why don’t you show these tables in Swing, by the way?)

Hi Marco. Now i have to add a gui with the possibilities to choose the xls file and then show these tables (in a swing gui or something like that) and then create an executable file (like a jar). How can i do ?