heamy.pipeline module
class heamy.pipeline.ModelsPipeline(*args)
Combines a sequence of models.

apply(func)
Applies a function to the models' outputs.
Parameters: func : function
Arbitrary function with one argument.
Returns: PipeApply
Examples
>>> import numpy as np
>>> from heamy.pipeline import ModelsPipeline
>>> pipeline = ModelsPipeline(model_rf, model_lr)
>>> pipeline.apply(lambda x: np.max(x, axis=0)).execute()
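As a rough illustration of what the lambda in the example computes, the sketch below mimics an element-wise maximum across two models' predictions in plain Python. It assumes that `func` receives the models' outputs stacked one row per model, which the description above does not state explicitly. `apply_along_models` and `elementwise_max` are hypothetical helpers, not part of heamy.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def apply_along_models(outputs: Sequence[Vector],
                       func: Callable[[Sequence[Vector]], Vector]) -> Vector:
    """Hypothetical stand-in: hand all model outputs to `func` in one call."""
    return func(outputs)


def elementwise_max(outputs: Sequence[Vector]) -> Vector:
    # Plain-Python counterpart of np.max(x, axis=0) for equal-length rows:
    # zip(*outputs) walks the rows column by column.
    return [max(column) for column in zip(*outputs)]


pred_rf = [0.2, 0.9, 0.4]  # made-up predictions of the first model
pred_lr = [0.3, 0.7, 0.6]  # made-up predictions of the second model

combined = apply_along_models([pred_rf, pred_lr], elementwise_max)
print(combined)  # [0.3, 0.9, 0.6]
```

Swapping `elementwise_max` for a function that averages each column would give the behaviour described for `mean()`.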

blend(proportion=0.2, stratify=False, seed=100, indices=None, add_diff=False)
Blends a sequence of models.
Parameters: proportion : float, default 0.2
stratify : bool, default False
seed : int, default 100
indices : list(np.ndarray,np.ndarray), default None
Two numpy arrays that contain indices for train/test slicing.
add_diff : bool, default False
Returns: DataFrame
Examples
>>> pipeline = ModelsPipeline(model_rf, model_lr)
>>> pipeline.blend(seed=15)
>>> # Custom indices
>>> train_index = np.array(range(250))
>>> test_index = np.array(range(250, 333))
>>> res = pipeline.blend(indices=(train_index, test_index))
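The general idea behind holdout blending can be sketched without the library: part of the training rows is set aside, every model is fitted on the remainder, and the models' predictions for the held-out rows become one feature column each. The code below is a minimal sketch under that assumption. `holdout_split`, `blend_features`, and the two toy models are hypothetical names, and heamy's own splitting (including `stratify` and `add_diff`) may work differently.

```python
import random
from statistics import mean


def holdout_split(n_samples, proportion=0.2, seed=100):
    """Shuffle row indices and reserve a `proportion` share as the holdout part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_holdout = int(round(n_samples * proportion))
    return sorted(indices[n_holdout:]), sorted(indices[:n_holdout])


class MeanModel:
    """Toy model: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.value = mean(y)
        return self

    def predict(self, X):
        return [self.value] * len(X)


class MaxModel:
    """Toy model: always predicts the largest training target."""

    def fit(self, X, y):
        self.value = max(y)
        return self

    def predict(self, X):
        return [self.value] * len(X)


def blend_features(models, X, y, proportion=0.2, seed=100):
    train_idx, holdout_idx = holdout_split(len(X), proportion, seed)
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_holdout = [X[i] for i in holdout_idx]
    # One column of holdout predictions per model ...
    columns = [m.fit(X_train, y_train).predict(X_holdout) for m in models]
    # ... transposed into one row per held-out sample.
    rows = [list(row) for row in zip(*columns)]
    return rows, train_idx, holdout_idx


X = [[float(i)] for i in range(10)]
y = [float(i) for i in range(10)]
features, train_idx, holdout_idx = blend_features(
    [MeanModel(), MaxModel()], X, y, proportion=0.2, seed=15
)
print(len(features), len(features[0]))  # 2 2
```

Passing explicit train/test index arrays, as in the custom-indices example above, would replace the call to `holdout_split` in this sketch.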

find_weights(scorer, test_size=0.2, method='SLSQP')
Finds optimal weights for a weighted average of the models.
Parameters: scorer : function
Scikit-learn like metric.
test_size : float, default 0.2
method : str, default 'SLSQP'
Type of solver. Should be one of:
- 'Nelder-Mead'
- 'Powell'
- 'CG'
- 'BFGS'
- 'Newton-CG'
- 'L-BFGS-B'
- 'TNC'
- 'COBYLA'
- 'SLSQP'
- 'dogleg'
- 'trust-ncg'
Returns: list
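To make the goal of the weight search concrete, here is a minimal sketch that looks for the best two-model weighting on made-up data. It deliberately swaps the numerical solver for a coarse grid search so that it runs with the standard library alone. The solver names listed above match the `method` values accepted by `scipy.optimize.minimize`, so the library presumably delegates to it, but that is an assumption. `grid_search_weights` and `weighted_average` are hypothetical helpers, and the scorer is assumed to be an error metric with the signature `scorer(y_true, y_pred)` where lower is better.

```python
def mean_absolute_error(y_true, y_pred):
    """Scikit-learn-like metric: scorer(y_true, y_pred), lower is better."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_average(predictions, weights):
    """Combine per-model prediction lists into one list using `weights`."""
    return [
        sum(w * p for w, p in zip(weights, column))
        for column in zip(*predictions)
    ]


def grid_search_weights(y_true, pred_a, pred_b, scorer, steps=10):
    """Try weights (w, 1 - w) on an even grid and keep the lowest score."""
    best_weights, best_score = None, float("inf")
    for i in range(steps + 1):
        w = i / steps
        weights = [w, 1.0 - w]
        score = scorer(y_true, weighted_average([pred_a, pred_b], weights))
        if score < best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Made-up validation data: one model overshoots by 1, the other undershoots by 1.
y_true = [1.0, 2.0, 3.0]
pred_a = [2.0, 3.0, 4.0]
pred_b = [0.0, 1.0, 2.0]

weights, score = grid_search_weights(y_true, pred_a, pred_b, mean_absolute_error)
print(weights, score)  # [0.5, 0.5] 0.0
```

A grid only scales to a handful of models; a constrained solver such as SLSQP handles more weights and can enforce that they sum to one.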

mean()
Returns the mean of the models' predictions.
Returns: PipeApply
Examples
>>> # Execute
>>> pipeline = ModelsPipeline(model_rf, model_lr)
>>> pipeline.mean().execute()
>>> # Validate
>>> pipeline = ModelsPipeline(model_rf, model_lr)
>>> pipeline.mean().validate()

stack(k=5, stratify=False, shuffle=True, seed=100, full_test=True, add_diff=False)
Stacks a sequence of models.
Parameters: k : int, default 5
Number of folds.
stratify : bool, default False
shuffle : bool, default True
seed : int, default 100
full_test : bool, default True
If True, evaluate the test dataset on the full data; otherwise, take the mean over the folds.
add_diff : bool, default False
Returns: DataFrame
Examples
>>> pipeline = ModelsPipeline(model_rf, model_lr)
>>> stack_ds = pipeline.stack(k=10, seed=111)
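The core of k-fold stacking is the out-of-fold prediction: each training row is predicted by a model that never saw it during fitting. The sketch below shows that step for a single toy model on made-up data, using contiguous folds without shuffling to keep the result easy to follow. `k_fold_indices`, `out_of_fold_predictions`, and `MeanModel` are hypothetical names, and the sketch does not cover `stratify`, `shuffle`, `full_test`, or `add_diff`.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


class MeanModel:
    """Toy model: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.value = mean(y)
        return self

    def predict(self, X):
        return [self.value] * len(X)


def out_of_fold_predictions(model, X, y, k=5):
    """Predict every row with a model fitted on the other k - 1 folds."""
    oof = [None] * len(X)
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i, pred in zip(fold, model.predict([X[i] for i in fold])):
            oof[i] = pred
    return oof


X = [[float(i)] for i in range(6)]
y = [0.0, 0.0, 3.0, 3.0, 6.0, 6.0]

oof = out_of_fold_predictions(MeanModel(), X, y, k=3)
print(oof)  # [4.5, 4.5, 3.0, 3.0, 1.5, 1.5]
```

Repeating this for every model in the pipeline yields one out-of-fold column per model, which is the kind of dataset a second-level model can be trained on.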